jeffatrobertsdotnet has asked for the wisdom of the Perl Monks concerning the following question:

I come seeking your insights and wisdom. I "understand" the following example but I can't "comprehend" it (or vice versa)...
$_='aaab';
print "Matched 1=>$_\n" if /^[a]{2,}(?![b])+/;
print "LastMatch1:$&\n";
print "Matched 2=>$_\n" if /^[a]{2,}/;
print "LastMatch2:$&\n";
I expected (hoped) that the first RE would fail, but it does not. It seems the negative lookahead causes a "lazy" match on the 1st pattern: $& is set to 'aa', as if the quantifier were non-greedy. If I change the {2,} quantifier to {3,}, the pattern behaves as I expect and the match fails. If I change it to {2,3}, it behaves the same as {2,}.

The second RE (without the lookahead) behaves greedily as expected setting $& to 'aaa'.

Your insights will be greatly appreciated.

Re: Greedy match and zero-width negative lookahead assertion
by Eily (Monsignor) on Mar 16, 2018 at 16:16 UTC

    Greedy means the longest valid match: the engine tries the longest candidate first and backtracks until the whole pattern succeeds. aaa isn't valid (because it is followed by b), so the longest match is aa (a series of 'a's not followed by b, which is exactly what you asked for). Adding a + to a quantifier makes it possessive and prevents that partial backtracking (meaning that if a given subpattern can match a longer string, that is the only string it will accept). So /^a{2,}+(?!b)/ will work as expected. /^a{2,}(?![ab])/ is another solution.
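
A minimal sketch of the three patterns discussed above, run side by side against 'aaab'. The helper name matched_text is hypothetical and exists only for this illustration. The possessive form assumes Perl 5.10 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the text matched by $re, or undef if $re fails.
sub matched_text {
    my ($re, $str) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

my @cases = (
    [ 'greedy',     qr/^a{2,}(?!b)/    ],  # may backtrack from 'aaa' to 'aa'
    [ 'possessive', qr/^a{2,}+(?!b)/   ],  # a{2,}+ never gives back an 'a'
    [ 'class [ab]', qr/^a{2,}(?![ab])/ ],  # lookahead also rejects a following 'a'
);

for my $case (@cases) {
    my ($label, $re) = @$case;
    my $m = matched_text($re, 'aaab');
    printf "%-11s %s\n", $label, defined $m ? "matched '$m'" : 'no match';
}
```

On 'aaab' only the plain greedy pattern should succeed (with 'aa'). The other two fail because neither leaves a position where the run of 'a's ends without a 'b' after it.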

      That's awesome! I guess I neither understood nor comprehended what I was doing... Thanks so much.
Re: Greedy match and zero-width negative lookahead assertion
by AnomalousMonk (Archbishop) on Mar 16, 2018 at 20:27 UTC

    (Side :) note also that the regex  (?![b])+ is problematic in the quoted match because it asks the regex engine to match a zero-width assertion as many times as possible. Perl would have warned you about this had you enabled warnings:

    c:\@Work\Perl\monks>perl -wMstrict -le
    "$_ = 'aaab';
     print qq{Matched 1 => '$_'} if /^[a]{2,}(?![b])+/;
     print qq{LastMatch1: '$&'};
    "
    (?![b])+ matches null string many times in regex; marked by <-- HERE in m/^[a]{2,}(?![b])+ <-- HERE / at -e line 1.
    Matched 1 => 'aaab'
    LastMatch1: 'aa'

    Update: "Possessive" quantifiers were added with Perl version 5.10. Both before that version and since, the same functionality has been available through the independent or "atomic" subexpression  (?>pattern)
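
As a rough illustration of the atomic form, the sketch below wraps the quantified part in (?>...) so that the run of 'a's cannot be shortened by backtracking. The helper name atomic_text and the second test string 'aaac' are assumptions made for this example only.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the atomic-group pattern and returns the
# matched text, or undef when the match fails.
sub atomic_text {
    my ($str) = @_;
    # (?>a{2,}) takes every available 'a' and refuses to release any of them,
    # so the lookahead is only ever tested at the end of the full run.
    return $str =~ /^((?>a{2,})(?!b))/ ? $1 : undef;
}

for my $str ('aaab', 'aaac') {
    my $m = atomic_text($str);
    print "$str: ", (defined $m ? "matched '$m'" : 'no match'), "\n";
}
```

With this pattern 'aaab' should fail to match, while 'aaac' should match 'aaa', which is the same behaviour as the possessive a{2,}+.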


    Give a man a fish:  <%-{-{-{-<