
It sure seems like a bug to me.

The regexp engine was massively changed between 5.8 and 5.10. Some bugs slipped through. Some have already been found and fixed. I don't know about this one. 5.10.1 is expected to be released before the end of the year.

On the plus side, both 5.8 and 5.10 compiled the regexp identically according to use re 'debug';. It's the matching that differs. In both versions, it finds that (.*w) only matched once and backtracks. Then, in 5.10, it seems to forget that the entire (?(...)...) failed.

As a workaround, replacing
(?=(.*w){2})
with
(?(?=(.*w){2})|(?!))
results in the desired behaviour.
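To sanity-check the workaround in isolation, here is a minimal sketch that compares the plain lookahead with the conditional form. The ^ anchor, the sample strings and the helper name matches are my own assumptions for illustration; they are not taken from the original poster's regex. On a Perl without the bug, both forms should agree on every input.

```perl
use strict;
use warnings;

# Plain lookahead: requires at least two "w" characters ahead.
my $plain      = qr/^(?=(.*w){2})/;
# Workaround form: a conditional whose "yes" branch is empty and
# whose "no" branch is (?!), which always fails.
my $workaround = qr/^(?(?=(.*w){2})|(?!))/;

# Hypothetical helper: 1 if the string matches the pattern, else 0.
sub matches {
    my ($str, $re) = @_;
    return $str =~ $re ? 1 : 0;
}

# Sample strings are made up for this sketch.
for my $str ("wow", "how", "window with a view", "abc") {
    printf "%-20s plain=%d workaround=%d\n",
        $str, matches($str, $plain), matches($str, $workaround);
}
```

This only shows that the two forms accept the same strings on their own; whether the rewrite sidesteps the 5.10 problem depends on the surrounding pattern.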

By the way, I'd place the ^ before the (?()), and I'd use grouping parens ((?:...)) instead of capturing parens ((...)).
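A small sketch of the difference between the two kinds of parens, using $#+ (the number of capture groups in the last successful match). The patterns and the helper name group_count are assumptions made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: number of capture groups in the pattern if it
# matches, undef if it does not match.
sub group_count {
    my ($str, $re) = @_;
    return $str =~ $re ? $#+ : undef;
}

my $capturing     = qr/^(?=(.*w){2})/;    # (...) allocates $1
my $non_capturing = qr/^(?=(?:.*w){2})/;  # (?:...) only groups

for my $str ("wow", "how") {
    for my $re ($capturing, $non_capturing) {
        my $n = group_count($str, $re);
        printf "%-4s %-24s %s\n", $str, $re,
            defined $n ? "match, $n capture group(s)" : "no match";
    }
}
```

Both patterns accept and reject the same strings; the non-capturing version simply avoids filling $1 when the captured text is not needed.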

Re^4: Oddity with Conditional Regex (bug!)
by almut (Canon) on Nov 12, 2008 at 22:32 UTC
    It sure seems like a bug to me.

    Yes, I've reached the same conclusion after having played around with this some more.

    The issue can be reduced to the following simple test case:

    use 5.010;  # enables say
    use re 'Debug' => "EXECUTE";
    say "xb" =~ /^(?(?!a) (?=bb)).b$/x       ? "yes":"no";  # -> yes (wrong)
    say "xb" =~ /^(?(?!a) (?{}) (?=bb)).b$/x ? "yes":"no";  # -> no (correct)

    The first case incorrectly matches. However, introducing a dummy eval (?{}) in the second case produces the correct result...

    The 'use re 'Debug' => "EXECUTE";' shows the execution differences (both versions compile to the same code — except for the eval, obviously):

    Guessing start of match in sv for REx "^(?(?!a) (?=bb)).b$" against "xb"
    Guessed: match at offset 0
    Matching REx "^(?(?!a) (?=bb)).b$" against "xb"
       0 <> <xb>                 |  1:BOL(2)
       0 <> <xb>                 |  2:LOGICAL[1](3)
       0 <> <xb>                 |  3:UNLESSM[0](9)
       0 <> <xb>                 |  5:  EXACT <a>(7)
                                        failed...
       0 <> <xb>                 |  9:IFTHEN(20)
       0 <> <xb>                 | 11:IFMATCH[0](20)
       0 <> <xb>                 | 13:  EXACT <bb>(15)
                                        failed...
       0 <> <xb>                 | 20:REG_ANY(21)      <-- incorrectly continues here
       1 <x> <b>                 | 21:EXACT <b>(23)
       2 <xb> <>                 | 23:EOL(24)
       2 <xb> <>                 | 24:END(0)
    Match successful!
    yes

    Guessing start of match in sv for REx "^(?(?!a) (?{}) (?=bb)).b$" against "xb"
    Guessed: match at offset 0
    Matching REx "^(?(?!a) (?{}) (?=bb)).b$" against "xb"
       0 <> <xb>                 |  1:BOL(2)
       0 <> <xb>                 |  2:LOGICAL[1](3)
       0 <> <xb>                 |  3:UNLESSM[0](9)
       0 <> <xb>                 |  5:  EXACT <a>(7)
                                        failed...
       0 <> <xb>                 |  9:IFTHEN(22)
       0 <> <xb>                 | 11:EVAL(13)
       0 <> <xb>                 | 13:IFMATCH[0](22)
       0 <> <xb>                 | 15:  EXACT <bb>(17)
                                        failed...
                                      failed...
    Match failed
    no
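For checking other builds without reading the debug trace, the reduced test case can be wrapped in a quick consistency check. This is only a sketch: the variable names and the messages are made up here, and it assumes that the two patterns should always give the same answer, since the empty (?{}) block does nothing.

```perl
use strict;
use warnings;

# Same two patterns as in the reduced test case, without the debug output.
# Since "xb" does not start with "a", the condition (?!a) is true, so the
# "yes" branch (?=bb) is tried; it fails, and the whole match should fail.
my $without_eval = "xb" =~ /^(?(?!a) (?=bb)).b$/x       ? "yes" : "no";
my $with_eval    = "xb" =~ /^(?(?!a) (?{}) (?=bb)).b$/x ? "yes" : "no";

print "without eval: $without_eval\n";
print "with eval:    $with_eval\n";
print $without_eval eq $with_eval
    ? "consistent\n"
    : "inconsistent (bug present)\n";
```

An affected perl should report the two results as inconsistent; a fixed one should report "no" for both.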

      I built Perl 5.10.x@34793 — first time building Perl! — which I believe is the latest Perl 5.10 as of Nov 10th

      It produces all yeses for the original snippet.
      It produces yes,no for the minimal snippet.

      Would you please submit a bug report?

        Would you please submit a bug report?

        Will do so...

        Update: ...or not :) Just tried it with bleadperl@34830, and the problem appears to have been fixed already (at least, both the original snippet and my reduced version produce correct results). So, I suppose there's not much point in submitting another bug report. Or is there?