terrencebrown has asked for the wisdom of the Perl Monks concerning the following question:

I thought I was versed in pattern matching until I came across this code. After deconstructing a Perl Golf solution from mtve, I found myself unable to understand why and how 2 lines of code work.

http://terje.perlgolf.org/

#!/usr/bin/perl
for(@a=glob"{\321,.,\321}"x2) {
    $x = $_;
    for (@a) {
        $h = ~$x;
        print '*' if $_ =~ m/$h/;   #Why does this line match?
        print ' ' if $_ !~ m/$h/;   #Why does this line not?
        #The next line is just for match tracking.
        $matches{"$_ =~ $h"} = $_ =~ m/$h/ ? '*' : ' ';
    }
    print "\n";
}
foreach $key(sort keys %matches) {
    print "$key $matches{$key}\n";
}

Here's the output:

*********
* ** ** *
*********
***   ***
* *   * *
***   ***
*********
* ** ** *
*********
.. =~ .. *
.. =~ .Ñ
.. =~ Ñ.
.. =~ ÑÑ
.Ñ =~ .. *
.Ñ =~ .Ñ *
.Ñ =~ Ñ.
.Ñ =~ ÑÑ
Ñ. =~ .. *
Ñ. =~ .Ñ
Ñ. =~ Ñ. *
Ñ. =~ ÑÑ
ÑÑ =~ .. *
ÑÑ =~ .Ñ *
ÑÑ =~ Ñ. *
ÑÑ =~ ÑÑ *

It seems odd to me that ÑÑ =~ Ñ. would match but Ñ. =~ ÑÑ would not. Here's an even simpler version that shows that Ñ =~ . matches and . =~ Ñ does not.

#!/usr/bin/perl
for(@a=glob"{\321,.,\321}"x1) {
    $x = $_;
    for (@a) {
        $h = ~$x;
        print '*' if $_ =~ m/$h/;   #Why does this line match?
        print ' ' if $_ !~ m/$h/;   #Why does this line not?
        #The next line is just for match tracking.
        $matches{"$_ =~ $h"} = $_ =~ m/$h/ ? '*' : ' ';
    }
    print "\n";
}
foreach $key(sort keys %matches) {
    print "$key $matches{$key}\n";
}

Here's the output:

***
* *
***
. =~ . *
. =~ Ñ
Ñ =~ . *
Ñ =~ Ñ *

Replies are listed 'Best First'.
Re: Cantor's Revenge Matching
by Paladin (Vicar) on May 01, 2003 at 16:45 UTC
    It seems odd to me that ÑÑ =~ Ñ. would match but Ñ. =~ ÑÑ would not. Here's an even simpler version that shows that Ñ =~ . matches and . =~ Ñ does not.

    In a regex, . matches any character (except \n by default), so m/Ñ./ matches a Ñ followed by any character, which ÑÑ is. m/ÑÑ/ matches only where ÑÑ appears, and the string Ñ. does not contain it. The difference is that in a regex, . has special meaning, while in a string, it doesn't.
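The asymmetry described above can be reproduced in isolation. The following is only a minimal sketch: the helper names matches_as_regex and matches_literally are made up for illustration and are not part of the original script, and "\321" is assumed to be the single byte 0xD1 (Latin-1 Ñ), as in the question. The second helper uses \Q...\E to show what happens when the dot is forced to be literal.

```perl
use strict;
use warnings;

# "\321" is the single byte 0xD1 (Latin-1 Ñ), as in the original script.
my $N = "\321";

# Hypothetical helper: treat $pattern as regex code, the way m/$h/ does.
sub matches_as_regex {
    my ($string, $pattern) = @_;
    return $string =~ m/$pattern/ ? 1 : 0;
}

# Hypothetical helper: \Q...\E quotes metacharacters, so '.' is a literal dot.
sub matches_literally {
    my ($string, $pattern) = @_;
    return $string =~ m/\Q$pattern\E/ ? 1 : 0;
}

print matches_as_regex("$N$N", "$N."), "\n";    # 1: '.' in the pattern matches the second Ñ
print matches_as_regex("$N.", "$N$N"), "\n";    # 0: the string has no second Ñ
print matches_literally("$N$N", "$N."), "\n";   # 0: a literal '.' is required now
```

If the pattern is meant to be plain text rather than regex code, quoting it with \Q...\E (or quotemeta) is the usual way to make the comparison symmetric.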

      So the value in $h is being used as regex code. Somehow I thought the pattern needed to be pre-compiled using qr//.

      $h =  qr/../;

      Thank you so much for the answer.

        Correct. Regexen go through double-quotish interpolation (with a few exceptions, like using single quotes as the delimiters) before being passed on to the regex engine, so you can say stuff like:
        my $foo = 'first part';
        my $anything = '.*';
        my $bar = 'last part';
        $_ =~ /$foo$anything$bar/;
        and have it mean the same as:
        $_ =~ /first part.*last part/;
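To make the equivalence concrete, here is a small sketch that runs the interpolated pattern, a qr// version, and the literal pattern against the same text. The sample text in $text and the variable names $via_strings, $via_qr, and $via_literal are invented for this illustration; only $foo, $anything, and $bar come from the reply above.

```perl
use strict;
use warnings;

my $foo      = 'first part';
my $anything = '.*';
my $bar      = 'last part';

# Sample input, made up for this sketch.
my $text = 'first part, then a middle, then the last part';

# Plain strings are interpolated into the pattern, which is then compiled.
my $via_strings = $text =~ /$foo$anything$bar/ ? 1 : 0;

# qr// builds a compiled regex object up front; it is optional, not required.
my $compiled = qr/$foo$anything$bar/;
my $via_qr   = $text =~ $compiled ? 1 : 0;

# The literal pattern that the interpolated one expands to.
my $via_literal = $text =~ /first part.*last part/ ? 1 : 0;

print "$via_strings $via_qr $via_literal\n";   # 1 1 1
```

All three forms give the same result here; qr// mainly helps when the same pattern is reused or passed around.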
Re: Cantor's Revenge Matching
by queue (Beadle) on May 01, 2003 at 16:38 UTC
    I might be missing something here, but the dot on the LHS is a literal dot character. The dot on the RHS is the dot metacharacter, which matches any single character. So a dot on the LHS isn't going to be matched by anything on the RHS except a literal dot or the dot metacharacter.
Re: Cantor's Revenge Matching
by terrencebrown (Acolyte) on May 01, 2003 at 17:40 UTC

    Very good for both answers. Thank you.