in reply to Empty pattern in regex

Per that perlop discussion I wrapped // in quotes with 'm' and tried:

perl -le 'print for a .. z' | perl -nle 'if (/d/ .. /h/) { next unless "m//"; print }'

and got:

  % perl -le 'print for a .. z' | perl  -nle 'if (/d/ .. /h/) { next unless "m//"; print }'
  d
  e
  f
  g
  h

And then I pedantically did:

for my $c ( 'a'..'z') { next unless ($c =~ /[d-h]/); say $& if $&; }

and got:

  d
  e
  f
  g
  h

I guess I don't understand. Isn't 'd e f g h' what is expected? I've never really trusted one-liners. Brian Foy wrote an interesting piece on SO, to wit:

https://stackoverflow.com/questions/22652393/regex-1-variable-reset

except his example using //; did not work for me. He expected all vars to be cleared but when I ran his code:

# The regex capture variables are only reset on the next successful match.
# This way, Perl saves a lot of time by not affecting variables when matches
# fail. As such, only use those variables with a guard, to wit:
#    if ( /abc/ ) {   # this tests for /abc/ success and now it's OK to use $&
#        ...
#    }
# Here's an extended demonstration, with a special surprise at the end:

say "First long example...\n";
$_ = 'Hello Perl';
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

# successful match
/(P)(erl)/;
print "First match\n";
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

# unsuccessful match
/(P)(ython)/;
print "Failed capture\n";
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

# successful match again
/(Pe)(r)(l)/;
print "Three captures\n";
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

# successful match, fewer captures
/(Perl)/;
print "One capture\n";
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

# successful match, no captures
/Perl/;
print "No captures\n";
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

# successful match, no pattern, special case
//;
print "No nothing\n";
print "\$1: $1\n\$2: $2\n\$3: $3\n\$&: $&\n\n";

I got:

  $1: 
  $2: 
  $3: 
  $&: 

  First match
  $1: P
  $2: erl
  $3: 
  $&: Perl

  Failed capture
  $1: P
  $2: erl
  $3: 
  $&: Perl

  Three captures
  $1: Pe
  $2: r
  $3: l
  $&: Perl

  One capture
  $1: Perl
  $2: 
  $3: 
  $&: Perl

  No captures
  $1: 
  $2: 
  $3: 
  $&: Perl

  No nothing
  $1: 
  $2: 
  $3: 
  $&: Perl

As you can see, in 'No nothing' $& was not cleared for me, whereas he reported in that piece that it was cleared for him. I don't trust using $n, $`, $& or $' unless I explicitly test for TRUE after the regex executes. Am I being overly paranoid?
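For what it's worth, here is a minimal sketch of the kind of guard I mean. The helper name first_capture is made up for illustration; the point is only that $1 is read inside the branch where the match is known to have succeeded, so a stale value from an earlier match can't leak through.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first capture group on success,
# and undef (in scalar context) when the match fails.
sub first_capture {
    my ($text, $re) = @_;

    # Read $1 only inside the branch guarded by a successful match.
    if ($text =~ $re) {
        return $1;
    }
    return;
}

my $ok   = first_capture('Hello Perl', qr/(P)(erl)/);
my $fail = first_capture('Hello Perl', qr/(P)(ython)/);

print defined $ok   ? "matched: $ok\n"   : "no match\n";   # matched: P
print defined $fail ? "matched: $fail\n" : "no match\n";   # no match
```

With the guard in place, the failed /(P)(ython)/ case reports "no match" instead of silently reusing the P left over from the earlier success.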

Re^2: Empty pattern in regex
by choroba (Cardinal) on Oct 19, 2023 at 12:46 UTC
    > Per that perlop discussion I wrapped // in quotes with m and tried:

    Which discussion? unless "m//" is the same as unless "1": it's just a string, and any non-empty string other than "0" is true, so the next never fires.
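A minimal sketch of that difference, assuming nothing beyond core Perl. The sub names keep_if_string and keep_if_match are invented for illustration: the first uses the quoted string as its condition, the second uses a real match.

```perl
use strict;
use warnings;

# Hypothetical helper: the condition is the string "m//", not a match.
sub keep_if_string {
    my @kept;
    for my $c (@_) {
        # A non-empty string is always true, so nothing is ever skipped.
        next unless "m//";
        push @kept, $c;
    }
    return @kept;
}

# Hypothetical helper: the condition is an actual pattern match.
sub keep_if_match {
    my @kept;
    for my $c (@_) {
        next unless $c =~ /[bc]/;
        push @kept, $c;
    }
    return @kept;
}

my @from_string = keep_if_string('a' .. 'e');
my @from_match  = keep_if_match('a' .. 'e');

print "string condition: @from_string\n";   # a b c d e
print "match condition:  @from_match\n";    # b c
```

So in the one-liner, the filtering to d through h comes entirely from the /d/ .. /h/ range; the quoted "m//" contributes nothing.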

    map{substr$_->[0],$_->[1]||0,1}[\*||{},3],[[]],[ref qr-1,-,-1],[{}],[sub{}^*ARGV,3]

      I tried "m//" as reported in your 'next' expression and got 'd e f g h', as expected. From my perlop on 5.36:

          The empty pattern "//"
                  If the *PATTERN* evaluates to the empty string, the last
                  *successfully* matched regular expression is used instead. In
                  this case, only the "g" and "c" flags on the empty pattern are
                  honored; the other flags are taken from the original pattern. If
                  no match has previously succeeded, this will (silently) act
                  instead as a genuine empty pattern (which will always match).
      
                  Note that it's possible to confuse Perl into thinking "//" (the
                  empty regex) is really "//" (the defined-or operator). Perl is
                  usually pretty good about this, but some pathological cases
                  might trigger this, such as "$x///" (is that "($x) / (//)" or
                  "$x // /"?) and "print $fh //" ("print $fh(//" or
                  "print($fh //"?). In all of these examples, Perl will assume you
                  meant defined-or. If you meant the empty regex, just use
                  parentheses or spaces to disambiguate, or even prefix the empty
                  regex with an "m" (so "//" becomes "m//").
      
        so "//" becomes "m//"

        The double quotes are part of the text, not part of the code (as you can infer from them being already around the //).
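Without the quotes, m// is the real match operator, and the first paragraph of the perlop excerpt applies. A short sketch of that behaviour, with a made-up helper name (empty_after_perl) and made-up test strings:

```perl
use strict;
use warnings;

# Hypothetical helper: match $first against /Perl/, then apply an empty
# pattern to $second and report whether that second match succeeded.
sub empty_after_perl {
    my ($first, $second) = @_;

    return 'first match failed' unless $first =~ /Perl/;

    # Unquoted m// is the match operator. Because /Perl/ has just
    # succeeded in this scope, the empty pattern reuses /Perl/ rather
    # than acting as a genuine empty pattern that matches anything.
    my $hit = $second =~ m//;
    return $hit ? 'matched' : 'did not match';
}

print empty_after_perl('Hello Perl', 'Hello Perl'),   "\n";   # matched
print empty_after_perl('Hello Perl', 'Hello Python'), "\n";   # did not match
```

A genuine empty pattern would match 'Hello Python' too, so the second result shows the reuse. This is also consistent with the 'No nothing' output earlier in the thread, where $& stayed at Perl after the bare //.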
