Re: \1, \2, \3, ... inside of a character class

Here is one way to build a character class from a backreference. Note that I had to use the (??{...}) construct, and without diving back into the gory details of regex parsing I'm not positive whether I'm relying on defined behavior or just happenstance. But it works!


use strict;
use warnings;

while ( my $string = <DATA> ) {
    chomp $string;
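    # (??{...}) runs at match time and turns whatever $1 captured into
    # a character class, e.g. "[abcde]+" for the sample data below.
    if ( $string =~ m/(\w+)\s((??{"[$1]+"}))/ ) {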
        print "$string => matched: $1, $2!\n";
    } else {
        print "$string => Didn't match.\n";
    }
}


__DATA__
abcde fgh
abcde eadcabe

With that snippet, the first line will fail to match and the second line will succeed, because the second word on the second line contains only characters found in the first word. This could probably be done more simply by breaking it down into smaller regexps that cascade from one to the next, but I couldn't resist the challenge of doing it in one.
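For comparison, here is a rough sketch of what the cascading approach might look like. The helper name match_subset is made up for illustration, and this version is a little stricter than the one-regexp solution: it assumes the whole second word must be drawn from the first word's characters, whereas the original pattern is happy with a matching prefix.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns both words if
# every character of the second word also appears in the first word,
# or an empty list otherwise.
sub match_subset {
    my ($string) = @_;

    # Step 1: pull out two words separated by a single whitespace character.
    my ( $first, $second ) = $string =~ m/(\w+)\s(\w+)/
        or return;

    # Step 2: build the character class from the first word. quotemeta is
    # a no-op for \w characters, but it keeps the class safe if step 1 is
    # ever loosened to allow characters such as ']' or '^'.
    my $class = quotemeta $first;
    return unless $second =~ m/\A[$class]+\z/;

    return ( $first, $second );
}

for my $line ( 'abcde fgh', 'abcde eadcabe' ) {
    if ( my ( $first, $second ) = match_subset($line) ) {
        print "$line => matched: $first, $second!\n";
    } else {
        print "$line => Didn't match.\n";
    }
}
```

No (??{...}) is needed here, because the first match has already finished by the time the second pattern is compiled, so ordinary interpolation does the job.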

Update: Having just re-read perlre, I'm satisfied that I'm relying on defined (though "experimental") behavior. The (??{...}) construct is a postponed regular subexpression: its code runs at match time and the result is used as a pattern. It has full access to the $n ($1, $2, etc.) special variables for any capture groups that have already matched. I could also have written the (??{...}) subexpression as

(??{"[$^N]+"})

...because $^N works like the $1, $2, etc. variables, but holds the text matched by the most recently closed capture group. Here that is group 1, since group 2 is still open when the code block runs.
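To make the difference between $^N and $1 visible, here is a small sketch with an extra leading word added to the test string. The sub names and the sample string are made up for illustration. With three words, $^N refers to the second word (the group closed most recently before the code block runs), while $1 still refers to the first.

```perl
use strict;
use warnings;

# Hypothetical helpers (not from the original post). Each returns the word
# the class was meant to come from and the text it matched, or an empty list.
sub match_using_last_closed {
    my ($string) = @_;
    # $^N is group 2 here: group 3 is still open when the code block runs.
    return $string =~ m/(\w+)\s(\w+)\s((??{"[$^N]+"}))/ ? ( $2, $3 ) : ();
}

sub match_using_first {
    my ($string) = @_;
    # $1 is always the first word, no matter how many groups follow it.
    return $string =~ m/(\w+)\s(\w+)\s((??{"[$1]+"}))/ ? ( $1, $3 ) : ();
}

my $string = 'xyz abcde eadcabe';

if ( my ( $source, $matched ) = match_using_last_closed($string) ) {
    print "\$^N version: class from $source, matched $matched\n";
}

print "\$1 version: Didn't match.\n"
    unless match_using_first($string);
```

The $^N version builds [abcde]+ and matches the third word, while the $1 version builds [xyz]+ and fails, so the two forms are only interchangeable when the group you want happens to be the one closed last.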

Dave
