in reply to Empty pattern in regex

Why is "f" printed?

I would have expected the question 'why are "f" and "g" printed'. Do you agree that printing "g" is also surprising, for the same reason? (If not, I may be misunderstanding random parts of your post.)

Why is there the empty regex (see (*))?

I don't know, seems very odd to me. I suggest reporting it as a possible bug.

It seems possible that since the last successfully matched regexp was /d/, and the last attempted match against that regexp was a fail, it may have somehow marked it as no longer successfully matched; but that doesn't explain the change of behaviour when you add the empty continue block.

I suspect rather that it is a scoping bug: I'm not sure if the docs make this clear, but it is intended to use the last successfully matched regexp visible to the current scope. Thus:

% perl -wle '"a" =~ /a/; { "b" =~ /b/ } "ab" =~ // and print $&'
a
%

FWIW p5p mostly regards the empty regexp behaviour as a misfeature reluctantly spared the axe only because of the constraints of backward compatibility - it is very rare to see anyone actually trying to make use of it. But since we have it, it certainly ought to work as advertised.

Re^2: Empty pattern in regex
by perlboy_emeritus (Scribe) on Oct 19, 2023 at 18:45 UTC

        > but it is intended to use the last successfully matched regexp

    Whatever its intent, it is one confusing puppy. //; always matches, always returns TRUE, but it never changes $&. $& is always whatever the previous regex set it to, whether it matched or not, effectively a NOP. Consider this:

    use feature 'say';

    $_ = 'Hello Perl';
    say '$_ = \'Hello Perl\';';
    print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

    # unsuccessful match
    /Python/;
    print "No match, \/Python\/\n";
    print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

    # successful match
    if (//) {
        print "No nothing, if (\/\/) {\n";
        print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";
    }
    else {
        print "\/\/ unsuccessful match";
    }

    # successful match, no captures
    /Perl/;
    print "Match \/Perl\/, No captures\n";
    print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

    # successful match, empty pattern
    if (//) {
        print "No nothing, if (\/\/) {\n";
        print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";
    }
    else {
        print "\/\/ unsuccessful match";
    }

    # successful match, no captures
    /Perl/;
    print "Match \/Perl\/, No captures\n";
    print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

    # successful match, no pattern, empty parens
    if (/()/) {
        #//;
        print "No nothing, if \(\/\(\)\/\) {\n";
        print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";
    }
    else {
        print "\/\/ unsuccessful match";
    }

    Which results in:

      $_ = 'Hello Perl';
      $1: 
      $2: 
      $3: 
      $&: 
    
      No match, /Python/
      $1:  
      $2: 
      $3: 
      $&: 
    
      No nothing, if (//) {
      $1: 
      $2: 
      $3: 
      $&: 
    
      Match /Perl/, No captures
      $1: 
      $2: 
      $3: 
      $&: Perl
    
      No nothing, if (//) {
      $1: 
      $2: 
      $3: 
      $&: Perl
    
      Match /Perl/, No captures
      $1: 
      $2: 
      $3: 
      $&: Perl
      
      No nothing, if (/()/) {
      $1: 
      $2: 
      $3: 
      $&: 
    

    What is or was the purpose of this construction? How would one use it?

      > What is or was the purpose of this construction? How would one use it?

      #!/usr/bin/perl
      use warnings;
      use strict;
      use feature qw{ say };

      $_ = 'abacad';

      say "/a(.)/";
      if (/a(.)/g) {
          say "\$1: $1";
          say "\$&: $&";
      }
      else {
          say 'No match';
      }

      for my $try (1 .. 3) {
          say "//";
          if (//g) {
              say "\$1: $1";
              say "\$&: $&";
          }
          else {
              say 'No match';
          }
      }
      Output:
      /a(.)/
      $1: b
      $&: ab
      //
      $1: c
      $&: ac
      //
      $1: d
      $&: ad
      //
      No match

      Update: If I remember correctly, this was the original reason the feature was introduced:

      #!/usr/bin/perl
      use warnings;
      use strict;
      use feature qw{ say };

      my $x = 'found 11';
      my $y = 'found 12';

      if ($x =~ /found (\d+)/ && $y =~ //) { # No need to repeat the long regex! Yay!
          say "Found $1.";
      }
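For comparison, here is a minimal sketch of how the same "don't repeat the long regex" goal is usually met today, by compiling the pattern once with qr//. This is not from the post above; the variable names $found_re and $result are made up for illustration.

```perl
use strict;
use warnings;

my $x = 'found 11';
my $y = 'found 12';

# Compile the pattern once and give it a name.
# $found_re is a hypothetical name chosen for this sketch.
my $found_re = qr/found (\d+)/;

my $result = '';
if ($x =~ $found_re && $y =~ $found_re) {
    # $1 comes from the most recent successful match, the one against $y.
    $result = "Found $1.";
}
print "$result\n";
```

Unlike //, the named pattern does not depend on which match last succeeded, so the second test cannot silently pick up a different regex.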

      map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

        I changed my mind about sitting this one out because this code from choroba is so fascinating, and worth exploring, as noted below.

        #!/usr/bin/env -S perl
        #use warnings; # suppress uninitialized warnings
        use strict;
        use feature qw{ say };
        use experimental qw( signatures );

        $_ = 'abXXXXacVVVVVad';
        #$_ = 'abacad';
        my $lim = 7;
        say "Examining \'$_\' $lim times";

        say "/a(.)/";
        if (/a(.)/g) {
        #say "/q(.)/"; # switch to see no-match example
        #if (/q(.)/g) {
            say "\$1: $1";
            say "\$&: $&";
        }
        else {
            say 'No match';
        }

        for my $try (1 .. $lim) {
            say "//";
            if (//g) {
                say "\$1: $1";
                say "\$&: $&";
            }
            else {
                say 'No match';
            }
        }

        $_ = '37.5BBBBB98UUUUU4.075QQQQQ42TTTT0.357SSS';
        $lim = 5;
        say "\nExamining \'$_\' $lim times";

        say '/[^\d.](\d+(?:\.?\d*)?)/g';
        if (/[^\d.](\d+(?:\.?\d*)?)/g) {
            say "\$1: $1";
            say "\$&: $&";
        }
        else {
            say 'No match';
        }

        for my $try (1..$lim) {
            say "//";
            if (//g) {
                say "\$1: $1";
                say "\$&: $&";
            }
            else {
                say 'No match';
            }
        }
        exit(0);
        __END__
          O U T P U T
        Examining 'abXXXXacVVVVVad' 7 times
        /a(.)/
        $1: b
        $&: ab
        //
        $1: c
        $&: ac
        //
        $1: d
        $&: ad
        //
        No match
        //
        $1: b
        $&: ab
        //
        $1: c
        $&: ac
        //
        $1: d
        $&: ad
        //
        No match

        Examining '37.5BBBBB98UUUUU4.075QQQQQ42TTTT0.357SSS' 5 times
        /[^\d.](\d+(?:\.?\d*)?)/g
        $1: 98
        $&: B98
        //
        $1: 4.075
        $&: U4.075
        //
        $1: 42
        $&: Q42
        //
        $1: 0.357
        $&: T0.357
        //
        No match
        //
        $1: 98
        $&: B98

        S O M E  O B S E R V A T I O N S
        The only use case I can infer from these code tweaks is the notion of determining the number of 'things' present in a string, as long as at least one of those 'things' is present, since there must be at least one match before //g can work its magic. I deliberately omitted a leading non-decimal in the second $_ to force the first match deeper in the string, but that's cool because as long as at least one 'thing' sub-expression is present, //g will find the rest (to the right). Also, note how the entire process starts over from the beginning if the number of $trys exceeds the number of 'things' present, as in $try (1..$lim) {. There must be other use cases for //g;. Any thoughts on how to exploit this facility?

        Will
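The "starts over from the beginning" effect described above looks like ordinary scalar-context /g behaviour rather than something specific to //: a failed match resets pos(), so the next attempt begins at the start of the string. Here is a minimal sketch with the pattern spelled out, for counting and collecting the 'things'. The variable names are illustrative and not taken from the post.

```perl
use strict;
use warnings;

my $s = 'abXXXXacVVVVVad';

# List context returns every capture at once; "= () =" turns that into a count.
my $count = () = $s =~ /a(.)/g;

# Scalar context finds one match per call, resuming from pos($s) each time.
my @seen;
while ($s =~ /a(.)/g) {
    push @seen, $1;
}

# The failed match that ended the loop reset pos($s) to undef ...
my $pos_after_loop = defined pos($s) ? pos($s) : 'undef';

# ... so one more scalar /g match starts again at the first 'thing'.
my $restarted = $s =~ /a(.)/g ? $1 : 'no match';

print "count: $count\n";
print "seen: @seen\n";
print "pos after loop: $pos_after_loop\n";
print "next match after failure: $restarted\n";
```

The while loop is the usual way to visit each match without guessing a loop limit in advance.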

      > // always matches, always returns TRUE, but it never changes $&.

      That is not correct, in either aspect:

      % perl -wle '"a" =~ /a/; "b" =~ // or print "did not match, did not return TRUE"'
      did not match, did not return TRUE
      % perl -wle '"a" =~ /.*/; q{$& changed} =~ // and print $&'
      $& changed
      %
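The same two points can be put in script form. This is a sketch only, in straight-line code at file scope; it does not try to reproduce the loop and continue-block case from the original question. The @log array is just a device for collecting results. The /(?:)/ spelling is the one perlop suggests when a genuinely empty match is wanted.

```perl
use strict;
use warnings;

my @log;

'a' =~ /a/;    # succeeds, so /a/ is now the last successfully matched pattern

# Here // stands in for /a/, which does not match 'b'.
push @log, ('b' =~ // ? 'matched' : 'no match');

# The failed attempt above does not replace /a/; it matches the 'a' in 'cat'.
push @log, ('cat' =~ // ? "matched '$&'" : 'no match');

# (?:) is not spelled as an empty pattern, so it is taken literally
# and matches the empty string at the start of any input.
push @log, ('b' =~ /(?:)/ ? "matched '$&'" : 'no match');

print "$_\n" for @log;
```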

        I concede, I give up and I surrender. Too confusing, why I get the result I do and you get the results you do. I'll sit this one out and just study the results y'all get. :-)