in reply to Re: Backreferences in negated character classes
in thread Backreferences in negated character classes

Many thanks for all of the responses. GrandFather, you were right on target as always; tye, thanks for cutting to the heart of the problem and pointing out a duh moment for me. :-)

The first problem is that character classes are constructed when the regexp is compiled, and do not change during the matching process. Because of that, the special syntax for backreferences in regexps does not extend inside the character class, so, as tye mentioned, the '\2' is actually treated as ASCII character 2.
The ($!\2) in your example actually interpolated the $! error variable into your regexp.

Thank you for stating that so explicitly - that was a core piece of knowledge that I was missing. Now I understand why my incorrect negative lookahead was matching four characters. I can't believe I missed the obvious typo in the lookahead ($! instead of ?!). I guess that's what I get for playing with regexen so late at night. :-)
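
For anyone else who trips over this, here is a minimal sketch of the difference. The patterns and test strings are my own illustration, not taken from the thread: inside the character class, \2 is an octal escape for the character with code 2, while the negative lookahead uses a real backreference.

```perl
use strict;
use warnings;

# Inside a character class, \2 is NOT a backreference. It is an octal
# escape for chr(2), fixed when the regex is compiled.
my $class = qr/^(\w)(\w)[^\2]\z/;

# The class only excludes chr(2), so "abb" still matches even though
# the third letter repeats what group 2 captured.
print "class matches abb\n" if "abb" =~ $class;
print "class rejects chr(2)\n" unless "ab\x02" =~ $class;

# A negative lookahead can use a real backreference instead:
# (?!\2) fails when the next character equals what group 2 captured.
my $lookahead = qr/^(\w)(\w)(?!\2)\w\z/;
print "lookahead rejects abb\n" unless "abb" =~ $lookahead;
print "lookahead matches aba\n" if "aba" =~ $lookahead;
```

Each of the four print statements should fire, which shows that only the lookahead version actually compares against the captured text.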

This gives you something nice and regular - it would be quite easy to write code to generate the above from the example string. Here's how it might work:

Thanks for the great example for building this type of regex on the fly. I wanted to capture the whole match, so I changed it as follows:

    my $regex = mkre($s);
    while ( $string =~ m/$regex/g ) {
        print $1, "\n";
        # do other stuff
    }

    sub mkre {
        my $s = shift;
        my $index = 1;    # using \1 to capture the whole match
        my (%seen, @elems);
        for (split //, $s) {
            if ($seen{$_}) {
                push @elems, "\\$seen{$_}";
            }
            else {
                push @elems, sprintf '(?! %s)', join ' | ', map "\\$_", 2 .. $index
                    if $index > 1;    # changed to start with \2
                $seen{$_} = ++$index;
                push @elems, '(\\w)';
            }
        }
        my $re = join( ' ', '(', @elems, ')' );    # create \1
        warn "$s: $re\n" if $DEBUG;
        qr/$re/x;
    }

Then I realized I could have left the sub as-is and just printed $& instead. :-)
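
A quick sketch of that alternative, using a small hand-written pattern and test string of my own rather than the output of mkre: $& holds the whole of the last successful match, so no outer capture group is needed.

```perl
use strict;
use warnings;

# Hypothetical pattern for illustration: a word character, a different
# word character, then the first one again (e.g. "aba").
my $re     = qr/(\w)(?!\1)(\w)\1/;
my $string = "xx aba cdc ee";

my @whole;
while ( $string =~ /$re/g ) {
    push @whole, $&;    # whole match, without wrapping the regex in ( ... )
}
print "@whole\n";    # prints "aba cdc"
```

As far as I know, $& used to slow down every regex in the program on perls older than 5.20, so on an older perl the /p flag with ${^MATCH} (available since 5.10) may be the safer choice.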

Thanks again for the help, and for such an elegant solution.

Update: japhy++ Very nice solution - taking that approach would enable me to create much more flexible (and more powerful) regexps. Thanks for posting it.