in reply to Oddity with Conditional Regex

If all the (?{ }) parts are removed, one of the clauses is empty. And since an empty string always matches, your answer is always "YES".
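
To illustrate, here is a minimal sketch of a conditional with an empty "no" clause. The pattern and the three test words are made up for this example and are not taken from the original post. Whenever the condition is false, the empty clause matches, so the test as a whole succeeds.

```perl
use strict;
use warnings;

# Hypothetical pattern: if the string contains an "o", also require a "w";
# otherwise fall through to the empty "no" clause, which always matches.
my $re = qr/^(?(?=.*o).*w|)/;

for my $word (qw(row oar cat)) {
    printf "%-3s => %s\n", $word, $word =~ $re ? "YES" : "NO";
}
```

Only "oar" (an "o" but no "w") should come out as NO. "cat" has no "o" at all, so it takes the empty clause and reports YES.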

Re^2: Oddity with Conditional Regex
by almut (Canon) on Nov 12, 2008 at 13:57 UTC

    I suppose the idea was to remove the empty else clause together with the debugging prints...  Anyhow, when doing so, i.e. with this test

    if(/ (?(?=(.*o){1}) (?(?!(.*r){1}) (?=(.*w){2}) ) | (?=(.*w){2}) (?=(.*r){1}) ) ^..W$ /xi)

    I can confirm that 5.8.8 and 5.10.0 are in fact producing different results: 5.8.8 gives YES/NO/NO/NO/YES/NO/NO/YES, while with 5.10.0 everything matches.

      It sure seems like a bug to me.

      The regexp engine was massively changed between 5.8 and 5.10. Some bugs fell through. Some were found and fixed already. I don't know about this one. 5.10.1 is expected to be released before the end of the year.

      On the plus side, both 5.8 and 5.10 compiled the regexp identically according to use re 'debug';. It's the matching that differs. In both versions, it finds that (.*w) only matched once and backtracks. Then, in 5.10, it seems to forget that the entire (?(...)...) failed.

      As a workaround, replacing
      (?=(.*w){2})
      with
      (?(?=(.*w){2})|(?!))
      results in the desired behaviour.

      By the way, I'd place the ^ before the (?()), and I'd use grouping parens ((?:...)) instead of capturing parens ((...)).
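
      A rough sketch of how these suggestions might look when applied to the test above: ^ moved to the front, (?:...) in place of (...), and the workaround form substituted for (?=(.*w){2}). The word list is hypothetical, since the original inputs are not shown in this thread, and the variable names are only for illustration. On a perl without the bug, both columns should agree.

      ```perl
      use strict;
      use warnings;

      # Plain form: the test above with ^ first and non-capturing groups.
      my $plain = qr/ ^
          (?(?=(?:.*o){1})
              (?(?!(?:.*r){1}) (?=(?:.*w){2}) )
            | (?=(?:.*w){2}) (?=(?:.*r){1})
          )
          ..W$ /xi;

      # Workaround form: each (?=(?:.*w){2}) becomes (?(?=(?:.*w){2})|(?!)).
      my $workaround = qr/ ^
          (?(?=(?:.*o){1})
              (?(?!(?:.*r){1}) (?(?=(?:.*w){2})|(?!)) )
            | (?(?=(?:.*w){2})|(?!)) (?=(?:.*r){1})
          )
          ..W$ /xi;

      # Hypothetical inputs, not the ones from the original post.
      for my $word (qw(wow row cow www wrw)) {
          printf "%-4s plain=%-3s workaround=%s\n", $word,
              ($word =~ $plain      ? "YES" : "NO"),
              ($word =~ $workaround ? "YES" : "NO");
      }
      ```

      Since (?!) can never match, the workaround form only spells out "fail if the lookahead fails" explicitly, so it should not change what the pattern means.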

        It sure seems like a bug to me.

        Yes, I've reached the same conclusion after having played around with this some more.

        The issue can be reduced to the following simple test case:

        use feature 'say';    # say is not enabled by default
        use re 'Debug' => "EXECUTE";

        say "xb" =~ /^(?(?!a) (?=bb)).b$/x       ? "yes":"no";  # -> yes (wrong)
        say "xb" =~ /^(?(?!a) (?{}) (?=bb)).b$/x ? "yes":"no";  # -> no (correct)

        The first case incorrectly matches. However, introducing a dummy eval (?{}) in the second case produces the correct result...

        The output of use re 'Debug' => "EXECUTE"; shows the execution differences (both versions compile to the same code, except for the eval, obviously):

        Guessing start of match in sv for REx "^(?(?!a) (?=bb)).b$" against "xb"
        Guessed: match at offset 0
        Matching REx "^(?(?!a) (?=bb)).b$" against "xb"
           0 <> <xb>                 |  1:BOL(2)
           0 <> <xb>                 |  2:LOGICAL[1](3)
           0 <> <xb>                 |  3:UNLESSM[0](9)
           0 <> <xb>                 |  5:  EXACT <a>(7)
                                            failed...
           0 <> <xb>                 |  9:IFTHEN(20)
           0 <> <xb>                 | 11:IFMATCH[0](20)
           0 <> <xb>                 | 13:  EXACT <bb>(15)
                                            failed...
           0 <> <xb>                 | 20:REG_ANY(21)    <-- incorrectly continues here
           1 <x> <b>                 | 21:EXACT <b>(23)
           2 <xb> <>                 | 23:EOL(24)
           2 <xb> <>                 | 24:END(0)
        Match successful!
        yes

        Guessing start of match in sv for REx "^(?(?!a) (?{}) (?=bb)).b$" against "xb"
        Guessed: match at offset 0
        Matching REx "^(?(?!a) (?{}) (?=bb)).b$" against "xb"
           0 <> <xb>                 |  1:BOL(2)
           0 <> <xb>                 |  2:LOGICAL[1](3)
           0 <> <xb>                 |  3:UNLESSM[0](9)
           0 <> <xb>                 |  5:  EXACT <a>(7)
                                            failed...
           0 <> <xb>                 |  9:IFTHEN(22)
           0 <> <xb>                 | 11:EVAL(13)
           0 <> <xb>                 | 13:IFMATCH[0](22)
           0 <> <xb>                 | 15:  EXACT <bb>(17)
                                            failed...
                                          failed...
        Match failed
        no