
You're not taking the non-greediness into account. Without /m, $ can match either at the very end of the string or just before a trailing newline. The non-greedy .+? stops at the first of those positions, so $1 holds 'foo'. If you also want to capture the terminal newline, use \z instead of $:

$ perl -e '$_="foo\n"; print "1=|$1|\n" if m/(fo.+?)$/s'
1=|foo|
$ perl -e '$_="foo\n"; print "1=|$1|\n" if m/(fo.+?)\z/s'
1=|foo
|

Your comparison with the first one-liner is not comparing apples with apples:

$ perl -e '$_="foo\nbar"; print "1=|$1|\n" if m/(fo.+?)$/s'
1=|foo
bar|
$ perl -e '$_="foo\nbar\n"; print "1=|$1|\n" if m/(fo.+?)$/s'
1=|foo
bar|
$ perl -e '$_="foo\nbar\n"; print "1=|$1|\n" if m/(fo.+?)\z/s'
1=|foo
bar
|

I also second ++Fletch's recommendation to use Regexp::Debugger. This allows you to step through the matching process and see exactly what's happening. I often use it myself.

— Ken

Re^4: Non-greedy match end of line bug?
by am12345 (Novice) on Oct 26, 2021 at 20:35 UTC

    I get what's going on now, thank you.

    I still think it's a bug, or at the very least a major implementation quirk that is incompatible with other regex implementations. JavaScript and Go treat /s the intuitive way and don't make an exception for \n at the end of a string.

    Type this into any browser console:

    "foo\nbar".match(/(fo.+?)$/s) && RegExp.$1
    "foo\n".match(/(fo.+?)$/s) && RegExp.$1

    Or try it on regex101.com - you get different matching results on PCRE vs non-PCRE based engines.

    I think it warrants a big warning in perlre. It was a nasty surprise for me even though I am far from being a Perl novice.

      > I still think it's a bug, or at the very least a major implementation quirk that is incompatible with other regex implementations.

      Well, JS claimed from the very beginning to re-implement the Perl 4 regex features.

      So if anything, it's JS that is buggy.

      Furthermore, since when does JS support the /s flag? I can see that it works in FF now, but I can't find it documented in MDN !?!

      Not long ago JS required a "weird" character class like [^] to also match newlines, like . with the /s flag in Perl.

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

        It's here in the summary table. Specifically /s / dotAll

        The cake is a lie.