Thanks, this is useful. However, it should come with two caveats for the unwary:

  1. This cheat sheet is not Perl-specific. For example, there is no /U modifier in Perl for ‘ungreedy patterns.’
  2. See comment #40 by merlyn: “Your email regex is wrong” (referring to a sample pattern on the png version of the cheat sheet).

Athanasius <°(((><contra mundum

Re^3: Regular Expression
by Socrates440 (Acolyte) on Jun 21, 2012 at 05:00 UTC
    After reading the resources that you all posted, and re-reading my chapter, I have managed to narrow my question. Obviously, the way that I am reading this code is incorrect, because I don't think it should work, but when I tested it I found that it did. Please tell me where my error in thinking is. The code comes straight from my Perl book. It is supposed to parse a string to determine whether the string contains a, e, i, o, u in order.
    #!/usr/bin/perl -w
    while (<>) {
        print if (/^[^aeiou]*a[^eiou]*e[^aiou]*i[^aeou]*o[^aeiu]*u[^aeio]*$/);
    }
    It is my understanding that the code should match if and only if there is a non vowel character at the beginning of the string, an a followed by any number of characters except for other vowels, an e followed by any number of characters except for any number of vowels, etc., and finally a u followed by any number of characters that are not other vowels at the end of the string.

    My questions:

    - The ^ anchor before the first set of brackets seems to me to say that the string must begin with a non vowel character in order to match. I tested this theory and as long as the vowels are in order the string can begin with a non vowel character. How does that work?
    - Why does the if statement not require brackets?
    - Why is the $ anchor necessary at the end? Wouldn't the regular expression do the same thing if it were left off?

    Thanks!!!
      there is a non vowel character at the beginning of the string
      No, and this is the misunderstanding. * means zero or more, so the string may have no non-vowel characters at the beginning (or anywhere between the vowels, for that matter).

      It is my understanding that the code should match if and only if there is a non vowel character at the beginning of the string...

      No, see Quantifiers in perlre, and contrast * with + (see choroba’s answer above).
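
      To make the * versus + contrast concrete, here is a minimal sketch. The $plus variant and the two test words are my own, not from the book; only $star is the pattern from the question.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The pattern from the question: * allows ZERO leading non-vowels.
my $star = qr/^[^aeiou]*a[^eiou]*e[^aiou]*i[^aeou]*o[^aeiu]*u[^aeio]*$/;

# Hypothetical variant: + in place of the first * demands at least one
# non-vowel before the a.
my $plus = qr/^[^aeiou]+a[^eiou]*e[^aiou]*i[^aeou]*o[^aeiu]*u[^aeio]*$/;

# Sample words chosen for illustration: one starts with a vowel, one does not.
for my $word (qw(abstemious facetious)) {
    printf "%-10s star=%-8s plus=%s\n", $word,
        ($word =~ $star ? 'match' : 'no match'),
        ($word =~ $plus ? 'match' : 'no match');
}
```

      The word that starts with a should match only the * version, which is the behaviour you observed in your test.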

      Why does the if statement not require brackets?

      See Statement Modifiers in perlsyn.
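
      As a rough illustration (the word list and the short /a.*e/ pattern are made up for the example), the statement-modifier form and the ordinary block form below should select the same items. I am assuming that by "brackets" you mean the curly braces.

```perl
use strict;
use warnings;

my @words = qw(facetious banana abstemious);   # sample input, not from the book

# Statement-modifier form: one simple statement followed by "if CONDITION".
# No braces are needed, and parentheses around the condition are optional.
my @modifier;
for (@words) {
    push @modifier, $_ if /a.*e/;
}

# Equivalent block form: here the body is a block, so braces are required.
my @block;
for (@words) {
    if (/a.*e/) {
        push @block, $_;
    }
}

print "modifier: @modifier\n";
print "block:    @block\n";
```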

      Why is the $ anchor necessary at the end?

      In this particular case, it probably isn’t. Update: See the answer by MidLifeXis below.

      HTH,

      Athanasius <°(((><contra mundum

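      The pattern also has one other restriction: no vowel may appear out of order anywhere in the string. (Each class still admits the vowel just matched, so a repeated a before the e is allowed, but an a after the e is not.) Without the final set of non-vowel matches and the end-of-string anchor, this restriction would not be enforced for whatever follows the u.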
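
      A small sketch of the difference, with a test string of my own choosing. $unanchored is a hypothetical variant that drops the trailing [^aeio]*$ from the original pattern.

```perl
use strict;
use warnings;

# Original pattern: everything after the u, up to the end, must be non-a/e/i/o.
my $anchored   = qr/^[^aeiou]*a[^eiou]*e[^aiou]*i[^aeou]*o[^aeiu]*u[^aeio]*$/;

# Hypothetical variant without the trailing class and $ anchor:
# it stops checking as soon as the u has been seen.
my $unanchored = qr/^[^aeiou]*a[^eiou]*e[^aiou]*i[^aeou]*o[^aeiu]*u/;

# 'facetious idea' has extra vowels after the u.
for my $text ('facetious', 'facetious idea') {
    printf "%-16s anchored=%-8s unanchored=%s\n", "'$text'",
        ($text =~ $anchored   ? 'match' : 'no match'),
        ($text =~ $unanchored ? 'match' : 'no match');
}
```

      Keeping [^aeio]* but dropping only the $ would behave like the unanchored version here, because * is free to match nothing and stop early.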

      --MidLifeXis