chibiryuu has asked for the wisdom of the Perl Monks concerning the following question:

The following program prints "fail1, fail2":
$T = qr//;
$U = qr//;
$_ = "12";
/(1)(2)/;
my ($t, $u) = ($1, $2);
print $t =~ /$T/ ? "pass1, " : "fail1, ";
print $u =~ /$U/ ? "pass2\n" : "fail2\n";
Either one of these changes:
$T = qr/./;
print $t =~ /(?:$T)/ ? "pass1, " : "fail1, ";
causes the program to print "pass1, pass2".

Re: Empty qr// fails to match -- Is this known bug?
by almut (Canon) on Oct 23, 2008 at 05:38 UTC

    It's no bug, but a feature. Quote from perlop:

    m/PATTERN/cgimosx

    (...) If the PATTERN evaluates to the empty string, the last successfully matched regular expression is used instead. In this case, only the "g" and "c" flags on the empty pattern is honoured - the other flags are taken from the original pattern. If no match has previously succeeded, this will (silently) act instead as a genuine empty pattern (which will always match).

    In this particular case, the last successfully matched regular expression is "(1)(2)", which matches neither "1" (i.e. $t) nor "2" ($u).
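A minimal sketch of the perlop rule quoted above, using a literal empty pattern rather than an interpolated qr// (how an interpolated qr// is treated is exactly what this thread is arguing about, so it is left out here). The helper name literal_empty_after_match is hypothetical, made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: run a successful match, then apply a literal
# empty pattern to $str in the same scope.
sub literal_empty_after_match {
    my ($str) = @_;
    "12" =~ /(1)(2)/ or die "setup match failed";
    # Per perlop, the empty pattern reuses the last successful one, (1)(2).
    my $ok = $str =~ m//;
    return $ok ? 'pass' : 'fail';
}

print literal_empty_after_match("1"),  "\n";   # "1" is tested against (1)(2)
print literal_empty_after_match("12"), "\n";   # "12" is tested against (1)(2)
```

The first call should report fail and the second pass, because both strings are really being tested against (1)(2), not against an empty pattern.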

      No, I still say it is a bug. Especially since print qr// prints a non-empty string, (?-xism:), and /(?-xism:)/ is not interpreted as an empty regex. More to the point, this rather clearly violates the principle of "least surprise". It isn't even useful. I'd actually prefer that even /$empty_string/ not trigger the "empty regex" special behavior. But it seems very clear to me that /$empty_regex/ should not trigger this behavior.

      - tye        
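The stringification point is easy to check. A small sketch, assuming nothing beyond core Perl; the helper name empty_qr_text is hypothetical. The exact text depends on the Perl version: (?-xism:) on the perls discussed in this thread, (?^:) on 5.14 and later. Either way the string is not empty.

```perl
use strict;
use warnings;

# Hypothetical helper: return the string form of an empty compiled regex.
sub empty_qr_text {
    my $re = qr//;
    return "$re";
}

my $text = empty_qr_text();
# The flag spelling inside the parentheses varies by Perl version,
# so only the length is reported here, not an expected literal.
print "qr// stringifies to: $text\n";
print "length: ", length($text), "\n";
```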

        Right. The problem for me is that $U is really qr/$some_user_provided_input/, so this behavior is very unexpected.

        I agree with tye and not almut: since "$U" ne "", /$U/ shouldn't be an empty pattern, so perlop's condition for empty // matching shouldn't apply.

        But after a bit of investigation, it seems that Perl always takes a shortcut: when $rx = qr/foo/, /$rx/ turns into /foo/. Which explains this behavior, but I still think it's a bug.
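For the user-supplied pattern case, one possible guard is to apply the (?:...) workaround from the original question at compile time, so the compiled pattern text can never be empty. This is only a sketch: compile_user_pattern and matches_after_prior_match are hypothetical helpers, not anything from perlop or from the posts above.

```perl
use strict;
use warnings;

# Hypothetical helper: compile user input inside a non-capturing group,
# so even empty input yields the non-empty pattern (?:).
sub compile_user_pattern {
    my ($input) = @_;
    return qr/(?:$input)/;
}

# Hypothetical helper: make sure an earlier match has succeeded in this
# scope, then test $str against the compiled pattern.
sub matches_after_prior_match {
    my ($str, $re) = @_;
    "12" =~ /(1)(2)/ or die "setup match failed";
    my $ok = $str =~ /$re/;
    return $ok ? 'pass' : 'fail';
}

my $U = compile_user_pattern("");   # empty user input
print matches_after_prior_match("2", $U), "\n";
```

Because the pattern text is (?:) rather than empty, the "last successful pattern" rule has nothing to latch onto, and the match against "2" should pass.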

        The bug's in the stringification of qr//, if anywhere. The empty regex problem is (merely) an infelicity.

        I don't see why the stringified representation of a Regex would be given any importance here. To me, the whole point of compiled Regexes is that they are not string Scalars.

        Globals suck.

        Be well,
        rir

Re: Empty qr// fails to match -- Is this known bug?
by wokka (Acolyte) on Oct 23, 2008 at 16:24 UTC
    Check out "help perlop".

    In the case of an empty regex, perl will use the last successfully-matched regex. So, in your case, it's attempting to use /(1)(2)/ on both of those strings and failing. If you do the following:
    $T = qr//;
    $U = qr//;
    $_ = "12";
    /(1)(2)/;
    my ($t, $u) = ($1, $2);
    $t="12";
    print $t =~ /$T/ ? "pass1, " : "fail1, ";
    print $u =~ /$U/ ? "pass2\n" : "fail2\n";
    You will see:

    "pass1, fail2"

    This functionality is... questionable.
      I meant "perldoc perlop", but it doesn't matter; I somewhat misunderstood anyway. Since qr// is a quote-like operator, I think this makes perfect sense, given the way the feature is supposed to work. $T represents a compiled regex, but an empty one. When the interpreter sees that it's an empty regex, it attempts to use the last successfully matched one (as it's supposed to do).
      $T = qr//;
      $U = qr//;
      $_ = "12";
      /(1)(2)/;
      my ($t, $u) = ($1, $2);
      print "t: $t\tu: $u\n";
      print 'before $t gets reassigned: ' . ( ( $t =~ $T ) ? 'pass' : 'fail' ) . "\n";
      $t="12";
      print 'after $t gets reassigned: ' . ( ( $t =~ $T ) ? 'pass' : 'fail' ) . "\n";
      print '$u never gets reassigned: ' . ( ( $u =~ $U ) ? 'pass' : 'fail' ) . "\n";
      my $foo = 'bar';
      print '$foo never got assigned a $1 or $2 or regex matched: ' . ( ( $foo =~ $T ) ? 'pass' : 'fail' ) . "\n";
      $foo =~ /b/;
      print '$foo got matched against /b/: ' . ( ( $foo =~ $U ) ? 'pass' : 'fail' ) . "\n";

      This little mess produces:

      t: 1 u: 2
      before $t gets reassigned: fail
      after $t gets reassigned: pass
      $u never gets reassigned: fail
      $foo never got assigned a $1 or $2 or regex matched: fail
      $foo got matched against /b/: pass