in reply to Re^3: Regex Optional capture doesn't
in thread Regex Optional capture doesn't

Hmmm

> The fix doesn't work because the regex has to decide which non-greedy ? has "precedence".

strange, it works like I initially thought

DB<63> print join "|",'abc' =~ /a(.+?)(c)?/
b|c
DB<64> print join "|",'abd' =~ /a(.+?)(c)?/
b|

well you seem to have other issues I can't see.

EDIT

Interesting

DB<84> print join "|",'a<c1<c2' =~ /(.+?)<(c.)?/ # ok
a|c1
DB<85> print join "|",'a<b1<b2' =~ /(.+?)<(c.)?/ # ok
a|
DB<86> print join "|",'a<b1<c2' =~ /(.+?)<(c.)?/ # oops
a|

You might want to add use re 'debug' to see what's happening.

update

ah now it's clearer

DB<89> print join "|",'a<b1<c2' =~ /(.+?)(c.)?/
a|

The non-greedy quantifier stops as soon as the rest of the pattern can succeed, and because (c.)? is optional it succeeds immediately by matching nothing.
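
To make that concrete, here is a small sketch that reruns the failing case next to one where the optional group sits directly after the first character. The captures helper is my own invention for illustration (it just mirrors the debugger one-liners and prints undef as an empty string).

```perl
use strict;
use warnings;

# Hypothetical helper: join all captures with "|", showing undef as "".
sub captures {
    my ($string, $regex) = @_;
    return join "|", map { $_ // '' } $string =~ $regex;
}

# .+? takes a single character, (c.)? is allowed to match nothing,
# so the overall match is already successful and nothing looks further.
print captures('a<b1<c2', qr/(.+?)(c.)?/), "\n";   # a|

# Here "c2" starts right where .+? stopped, so the optional group captures.
print captures('ac2', qr/(.+?)(c.)?/), "\n";       # a|c2
```

The optional group only captures when its text begins exactly where the lazy quantifier first stopped; nothing forces the engine to keep scanning for it.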

Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!

Re^5: Regex Optional capture doesn't (updated)
by NetWallah (Canon) on Oct 05, 2017 at 17:40 UTC
    Ok - your explanation makes sense... Thanks (++).

    I'll wait to see if other monks can find a way to do what I need with a single regex.

                    All power corrupts, but we need electricity.

      > can find a way to do what I need with a single regex.

      "Single regex" is relative in Perl, since you can embed Perl code that inspects a sub-match.

      Does it have to be a single regex, or is it just academic curiosity?
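
If a single regex is not a hard requirement, two simple matches avoid the interaction between the lazy quantifier and the optional group altogether. This is only a sketch: head_and_c_token is a made-up name, and I am assuming the input looks like the test strings in this thread (a head, then "<"-separated tokens).

```perl
use strict;
use warnings;

# Hypothetical helper: return the text before the first "<" and,
# separately, the first "c." token that follows any "<".
sub head_and_c_token {
    my ($string) = @_;

    # Step 1: the head must be there, otherwise give up.
    my ($head) = $string =~ /^(.+?)</ or return;

    # Step 2: the token is optional, so a failed match just leaves it undef.
    my ($token) = $string =~ /<(c.)/;

    return ($head, $token);
}

for my $s ('a<b1<b2', 'a<b1<c2', 'a<c1<c2') {
    my ($head, $token) = head_and_c_token($s);
    printf "%s => %s|%s\n", $s, $head, $token // '';
}
```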

      edit

      You need a positive condition that lets the non-greedy quantifier stop only once you have a match or have reached the end of the string.

      DB<113> print join "|",'a<b1<b2' =~ /(a).+?(?:(c.)|$)/
      a|
      DB<114> print join "|",'a<b1<c2' =~ /(a).+?(?:(c.)|$)/
      a|c2
      DB<115> print join "|",'a<c1<c2' =~ /(a).+?(?:(c.)|$)/
      a|c1

      Try to apply this technique.
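
For use outside the debugger, the same pattern could be wrapped roughly like this. first_c_token is a hypothetical name, and the pattern is the one from the test strings above, not the real log format, so it would need adapting.

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading "a" and the first "c." token, if any.
sub first_c_token {
    my ($string) = @_;

    # .+? keeps expanding until either (c.) matches or the end of the
    # string is reached; in the second case the capture stays undef.
    my ($head, $token) = $string =~ /(a).+?(?:(c.)|$)/;

    return ($head, $token);
}

for my $s ('a<b1<b2', 'a<b1<c2', 'a<c1<c2') {
    my ($head, $token) = first_c_token($s);
    printf "%s => %s|%s\n", $s, $head, $token // '';
}
```

The alternation is what makes the difference: stopping early is no longer a successful match, so the lazy quantifier has to keep going until one of the two branches is satisfied.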

      HTH

      Cheers Rolf
      (addicted to the Perl Programming Language and ☆☆☆☆ :)
      Je suis Charlie!

        More like a perverse obsession.

        Thanks for your tip on "use re 'debug'".

        It shows that the RE gets confounded by the <Unwanted> tag.
        Removing that tag allows the regex to capture both items I want.

        However, this is not an option while I'm parsing the log.

                        All power corrupts, but we need electricity.