in reply to Re: Non-greedy match end of line bug?
in thread Non-greedy match end of line bug?

Well... I would buy it "by default" or with the /m qualifier, but I did add the /s qualifier. With that, I think \n is supposed to be treated just like any other character. Should it not?
> s: Treat string as single line. That is, change "." to match any character whatsoever, even a newline, which normally it would not match.

Re^3: Non-greedy match end of line bug?
by kcott (Archbishop) on Oct 26, 2021 at 02:12 UTC

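    You're not taking the non-greediness into account. Without /m, $ matches at the end of the string or just before a newline at the end of the string; /s doesn't change that, it only changes what . matches. The non-greedy .+? stops at the first position where $ can match, which is before the newline, so $1 holds 'foo'. If you also want to match the terminal newline, use \z instead of $: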

    $ perl -e '$_="foo\n"; print "1=|$1|\n" if m/(fo.+?)$/s'
    1=|foo|
    $ perl -e '$_="foo\n"; print "1=|$1|\n" if m/(fo.+?)\z/s'
    1=|foo
    |

    Your comparison with the first one-liner is not comparing apples with apples:

    $ perl -e '$_="foo\nbar"; print "1=|$1|\n" if m/(fo.+?)$/s'
    1=|foo
    bar|
    $ perl -e '$_="foo\nbar\n"; print "1=|$1|\n" if m/(fo.+?)$/s'
    1=|foo
    bar|
    $ perl -e '$_="foo\nbar\n"; print "1=|$1|\n" if m/(fo.+?)\z/s'
    1=|foo
    bar
    |
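
    For anyone comparing against JavaScript, here is a rough sketch of the same three inputs. It assumes that Perl's $ (without /m) corresponds to the lookahead (?=\n?$) in a JavaScript pattern without the m flag; perlStyleEnd and capture are names made up for illustration.

```javascript
// Sketch: emulate Perl's $ (no /m) in JavaScript with a lookahead.
// perlStyleEnd and capture are illustrative names, not built-ins.
const perlStyleEnd = /(fo.+?)(?=\n?$)/s;

function capture(str) {
  // Return the first capture group, or null when there is no match.
  const m = str.match(perlStyleEnd);
  return m ? m[1] : null;
}

const emulated = ["foo\n", "foo\nbar", "foo\nbar\n"].map(capture);
console.log(JSON.stringify(emulated)); // ["foo","foo\nbar","foo\nbar"]
```

    The captures line up with the $ one-liners above: the trailing newline is left out, while a newline in the middle of the string is consumed by .+? under the s flag.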

    I also second ++Fletch's recommendation to use Regexp::Debugger. This allows you to step through the matching process and see exactly what's happening. I often use it myself.

    — Ken

      I get what's going on now, thank you.

      I still think it's a bug, or at the very least a major implementation quirk that is incompatible with other regex implementations. JavaScript and Go treat /s the intuitive way and don't make an exception for \n at the end of a string.

      Type this into any browser console:

      "foo\nbar".match(/(fo.+?)$/s) && RegExp.$1
      "foo\n".match(/(fo.+?)$/s) && RegExp.$1

      Or try it on regex101.com - you get different matching results on PCRE vs non-PCRE based engines.
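
      The console check above can also be written as a small self-contained script. This is only a sketch: firstCapture is a made-up helper, and the capture is read from the match result instead of the legacy RegExp.$1.

```javascript
// Sketch: how JavaScript treats $ with the s (dotAll) flag and no m flag.
// firstCapture is a hypothetical helper, not part of any library.
function firstCapture(str, re) {
  const m = str.match(re);
  return m ? m[1] : null;
}

const lazyToEnd = /(fo.+?)$/s;

const midNewline = firstCapture("foo\nbar", lazyToEnd);
const trailingNewline = firstCapture("foo\n", lazyToEnd);

console.log(JSON.stringify(midNewline));      // "foo\nbar"
console.log(JSON.stringify(trailingNewline)); // "foo\n"
```

      In the second case the capture includes the trailing newline, which is where the result differs from Perl's 'foo'.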

      I think it warrants a big warning in perlre. It was a nasty surprise for me even though I am far from being a Perl novice.

        > I still think it's a bug, or at the very least a major implementation quirk that is incompatible with other regex implementations.

        Well, JS claimed from the very beginning to re-implement the Perl4 regex features.

        So if anything, then it's JS which is buggy.

        Furthermore, since when does JS support the /s flag? I can see that it works in FF now, but I can't find it documented on MDN!?

        Not long ago JS required a "weird" character class like [^] to also match newlines, like . with the /s flag in Perl.
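
        A small sketch of that workaround next to the s flag, for comparison. The sample string and variable names are made up for illustration, and the last line needs an engine that supports the s (dotAll) flag.

```javascript
// Sketch: three ways a JavaScript pattern does or doesn't cross a newline.
const sample = "foo\nbar";

const plainDot = /fo.+bar/.test(sample);        // . does not match \n by default
const emptyNegClass = /fo[^]+bar/.test(sample); // [^] matches any character, including \n
const withDotAll = /fo.+bar/s.test(sample);     // with the s flag, . matches \n too

console.log(plainDot, emptyNegClass, withDotAll); // false true true
```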

        Cheers Rolf
        (addicted to the Perl Programming Language :)