in reply to Re: Regex (lookahead) Confusion

You are *so* the monk!

The regex works because of the variable length of the gap (anywhere from zero characters to the rest of the line) before the backreference matches. And lookaheads can be variable length.

This is essentially the converse of the approach I had in mind when I was trying to solve the problem. Someday...

while (<DATA>) {
    chomp;
    if ( /^(?:([smtwhfa])(?!.*\1))*$/ ) {
        print "$_ : OK\n";
    }
    else {
        print "$_ : Not OK\n";
    }
}
__DATA__
swma
smqa
smsa
fhtm
ttma
t2ms
__END__
swma : OK
smqa : Not OK
smsa : Not OK
fhtm : OK
ttma : Not OK
t2ms : Not OK
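
For anyone reading along, here is the same pattern spelled out with /x and comments. Treat it as a sketch: the sub name valid_unique is made up for illustration, and the pattern is meant to behave the same as the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical helper name; the pattern is the one from the post above.
sub valid_unique {
    my ($str) = @_;
    return $str =~ /
        ^                  # start of string
        (?:                # group without capturing
            ([smtwhfa])    # one allowed letter, captured as \1
            (?! .* \1 )    # fail if that same letter appears again later
        )*                 # repeat for every character
        $                  # end of string
    /x ? 1 : 0;
}

for my $s (qw(swma smsa t2ms)) {
    printf "%s : %s\n", $s, valid_unique($s) ? 'OK' : 'Not OK';
}
```

Because the capture sits inside the repeated group, \1 always refers to the letter matched in the current pass, so every letter gets its own "not seen again" check.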

--
Allolex

Re: Re: Re: Regex (lookahead) Confusion
by ChrisR (Hermit) on Feb 05, 2004 at 22:01 UTC
    So let's see if I understand this and please correct me if I'm wrong.

    The ?: means that the parens are just for grouping and will load any matches into $1, $2, etc.
    Then the character class in parens
    and then the lookahead...
    ?! means that it is a negative lookahead
    .* means any character 0 or more times (very greedy)
    \1 is related to the character matched from the character class
    It's the .* that is throwing me off. To me, that looks like it would match a single character repeated any number of times but not separated duplicate characters. I just don't understand it ... yet. I will keep looking.

      s/will load/will not load/, right?
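
      A quick way to convince yourself, sketched with a throwaway string ('ab' is just an example, not anything from the original problem):

      ```perl
      use strict;
      use warnings;

      # With plain parens the outer group is $1 and the inner one is $2.
      my @capturing = 'ab' =~ /^((a)b)$/;

      # With (?:...) the outer group only groups, so the inner one becomes $1.
      my @grouping_only = 'ab' =~ /^(?:(a)b)$/;

      print scalar(@capturing), " captures: @capturing\n";
      print scalar(@grouping_only), " capture: @grouping_only\n";
      ```

      That is why the character class ends up in \1 in the regex above: the outer (?:...) does not take up a capture slot.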

      The .* is saying there can be any number of characters in the string between the first match (the character class) and the backreference (\1) match, from zero (two characters next to each other) on up...

      Say our first match was 'a'; then \1 also refers to 'a'. So we have a.*a: aa, a.a, a..a, a...a, a....a, etc. And it's a negative lookahead, so if any one of those combinations matches, the regex will fail.
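
      To see that in action, here is a small sketch that checks only the first character of a few made-up strings. The sub name first_is_unique is hypothetical, and the pattern is a cut-down version of the one in the thread, not the full solution.

      ```perl
      use strict;
      use warnings;

      # Hypothetical helper: true if the first character never appears again.
      sub first_is_unique {
          my ($str) = @_;
          # (.) captures the first character; the negative lookahead fails
          # if .* can reach another copy of it anywhere later in the string.
          return $str =~ /^(.)(?!.*\1)/ ? 1 : 0;
      }

      for my $s (qw(aa aba abca abcd)) {
          printf "%-4s : first letter %s\n", $s,
              first_is_unique($s) ? 'is unique' : 'repeats';
      }
      ```

      The .* is what lets the lookahead skip over whatever sits between the two copies, whether that is nothing at all (aa) or several other characters (abca).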

      --
      Allolex