Athanasius has asked for the wisdom of the Perl Monks concerning the following question:

I’ve just today discovered that, inside a regex, $_ takes on the value of the string being matched against:

16:12 >perl -wE "my $string = 'abcde'; $_ = 42; say; $string =~ /(?{ say })/;"
42
abcde

16:12 >perl -v
This is perl 5, version 22, subversion 0 (v5.22.0) built for MSWin32-x64-multi-thread

This brings up two questions. First, is it documented anywhere? It’s not mentioned in the perlvar entry for $ARG.

Second, what is the rationale for this behaviour?

Here’s the background:

Following on from the recent thread Shorthand for /^\s*abc\s*$/ ?, I was experimenting with (??{ }) as an alternative means of parameterizing a regex without using a sub. This works:

use strict;
use warnings;

my %searches = (
    bar  => " bar \n\n",
    bar1 => " bar1 \nw\n",
    baz  => "baz",
    foo  => " foo c \n",
    foo1 => " foo ",
);

my $x;
my $regex = qr{ ^ \s* (??{ $x }) \s* $ }x;

for (sort keys %searches) {
    $x = $_;
    print "$_: ", $searches{$_} =~ $regex ? 'match' : 'no match', "\n";
}

but the use of $x is clumsy. So I tried $_ instead:

my $regex = qr{ ^ \s* (??{ $_ }) \s* $ }x;

After some debugging using perl -Mre=debug ..., it finally dawned on me that within the regex, $_ no longer contains the current hash key, but has been reset to the current hash value (i.e., the string on the left of the =~) with whitespace removed.[1]

Which seems strange to me, as well as unnecessary. Is there a situation in which this implicit assignment is actually useful?

[1] Update: Sorry, it’s (??{ $_ }) that has whitespace removed, due to the /x modifier.
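One way to sidestep the problem, offered only as a sketch and not something proposed in this thread: interpolate the key while the regex is being built, so that $_ is read before the match starts and is still the hash key. This assumes that recompiling the pattern for each key is acceptable; the \Q...\E quoting and the %result hash are additions made for illustration.

```perl
use strict;
use warnings;

my %searches = (
    bar  => " bar \n\n",
    bar1 => " bar1 \nw\n",
    baz  => "baz",
    foo  => " foo c \n",
    foo1 => " foo ",
);

my %result;    # hypothetical: collects the outcome per key

for (sort keys %searches) {
    # $_ is interpolated here, at qr construction time, so it is still
    # the hash key; \Q...\E (an assumption) quotes any metacharacters.
    my $regex = qr{ ^ \s* \Q$_\E \s* $ }x;

    $result{$_} = $searches{$_} =~ $regex ? 'match' : 'no match';
    print "$_: $result{$_}\n";
}
```

The trade-off is that the pattern is no longer compiled once up front, which is what the (??{ }) version was trying to achieve.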

Thanks,

Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,

Replies are listed 'Best First'.
Re: Value of $_ inside a regex
by Discipulus (Canon) on Oct 12, 2015 at 07:53 UTC
    Morning Athanasius, and compliments on your discovery: diving into Perl seems a never-ending story.

    My little contribution: I searched in perlretut and read:
    Normally, regexps are a part of Perl expressions. Code evaluation expressions turn that around by allowing arbitrary Perl code to be a part of a regexp. A code evaluation expression is denoted (?{code}), with code a string of Perl statements. Be warned that this feature is considered experimental, and may be changed without notice.
    Maybe this is the right place to include some words about what you observed. But the behaviour is not a matter of $_ itself; it is a matter of the (?{code}) construct. On the other hand, the first words about $_ are:
    The default input and pattern-searching space.
    From this point of view the behaviour of $_ is the same as I have observed in every block of Perl code: id est 'the current thing' or, as we speak Latin, the current 'quidquid'. It seems that the current thing inside a regex is the target string, which is not so bad. That is generally speaking; for the key-value question I still need to understand it better.

    L*

    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
      Aha, so we're talking about a specific construct, (?{ code }), documented at http://perldoc.perl.org/perlre.html#%28?{-code-}%29, which says:
      inside a (?{...}) block, $_ refers to the string the regular expression is matching against. You can also use pos() to know what is the current position of matching within this string.
        so that part is documented; good to know.
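For anyone who wants to see the documented behaviour directly, here is a minimal sketch (not taken from any post in this thread) that records both $_ and pos() from inside a (?{ }) block. The names $text and @trace are made up for the illustration.

```perl
use strict;
use warnings;

my $text = 'abcde';
my @trace;          # hypothetical: collects what the code block sees

$_ = 'outer';       # value of $_ outside the match

# The code block runs once the literal 'c' has matched; inside it,
# $_ is the string being matched against and pos() is the current
# match position within that string.
$text =~ / c (?{ push @trace, [ $_, pos() ] }) /x;

printf "inside:  \$_ = '%s', pos = %d\n", @{ $trace[0] };
print  "outside: \$_ = '$_'\n";
```

Outside the match $_ keeps its earlier value, which agrees with the "after match" line in AnomalousMonk's output.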


        I see this statement in the perlre documentation of (?{...}) as far back as 5.8.9, but nothing like it for (??{...}). Interesting.


        Give a man a fish:  <%-{-{-{-<

Re: Value of $_ inside a regex
by AnomalousMonk (Archbishop) on Oct 12, 2015 at 07:24 UTC

    Just for support, the following behaves the same under ActiveState 5.8.9 and Strawberry 5.14.4:

    c:\@Work\Perl>perl -wMstrict -le
    "print qq{perl version $]};
     ;;
     my %h = qw(foo ZOT);
     ;;
     $_ = 'foo';
     ;;
     print qq{before rx: '$_'};
     my $rx = qr{ \A \s* (??{ print qq{in rx: '$_'}; $_; }) }xms;
     ;;
     print qq{before match: '$_'};
     print 'match' if $h{'foo'} =~ $rx;
     print qq{after match; '$_'};
    "
    perl version 5.008009
    before rx: 'foo'
    before match: 'foo'
    in rx: 'ZOT'
    match
    after match; 'foo'

    And no, I've never encountered this behavior, either (but neither have I looked for it).



Re: Value of $_ inside a regex
by shmem (Chancellor) on Oct 12, 2015 at 13:17 UTC

    Just to complete the above: a (?{}) in the pattern also leaves $_ set to the LHS in the replacement part of s///e, even if the code block is a no-op:

    qwurx [shmem] ~> perl -le '$s = "foo"; $_="bar"; $s =~ s/./print/e'
    bar
    qwurx [shmem] ~> perl -le '$s = "foo"; $_="bar"; $s =~ s/.(?{print})/print/e'
    foo
    foo
    qwurx [shmem] ~> perl -le '$s = "foo"; $_="bar"; $s =~ s/.(?{})/print/e'
    foo

    perl -le'print map{pack c,($-++?1:13)+ord}split//,ESEL'
Re: Value of $_ inside a regex
by stevieb (Canon) on Oct 12, 2015 at 16:07 UTC

    I don't know if this adds anything valuable to this thread, but I've played around quite a bit with code eval expressions. Here's a trick that allows you to do modification without using substitution:

    perl -wMstrict -E 'for(1..3){/2(?{$_ += 10})/;say}'
    1
    12
    3