Excellent point, thank you for spotting that! I can confirm that the (?= ) around (?<lookback> ) can be removed in all cases in the root node (since (?<= ) is already zero-width). Makes the regexes even shorter! :-)
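
To make that concrete, here is a minimal sketch of the positive case. The root node's regexes are not reproduced in this post, so $re_wrapped and $re_bare are hypothetical stand-ins (match any /\d./ that *is* preceded by an /a/), built by analogy with the negative example below. The helper targets_of() is likewise only for illustration.

```perl
use strict;
use warnings;

# Hypothetical positive-case regexes (assumed, not copied from the root node).
# First with the extra (?= ) around the lookback group ...
my $re_wrapped = qr{
    (?= (?<lookback> (?<= a | (?=(?&lookback)) . ) ) )
    (?<target> \d . )
}msx;
# ... and then with that (?= ) dropped.
my $re_bare = qr{
    (?<lookback> (?<= a | (?=(?&lookback)) . ) )
    (?<target> \d . )
}msx;

# Collect every "target" capture that a regex finds in a string.
sub targets_of {
    my ($regex, $str) = @_;
    my @found;
    while ($str =~ /$regex/g) {
        push @found, $+{target};
    }
    return @found;
}

# Print both result lists side by side for a few sample strings.
for my $str ("fo", "5ab", "ab5 x4", "x2 4x3a55aaa1") {
    my @wrapped = targets_of($re_wrapped, $str);
    my @bare    = targets_of($re_bare,    $str);
    printf "%-16s wrapped=[%s] bare=[%s]\n",
        qq{"$str"}, join("|", @wrapped), join("|", @bare);
}
```

If the two forms really are equivalent, both columns should be identical on every line.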

It's probably a vestige of the negative case, like here or in the following example, where the (?! (?<lookback> ... ) ) wrapper is needed*.

# Match any /\d./ that is *not* preceded by an /a/
my $re5 = qr{
    (?! (?<lookback>
        (?<= a | (?=(?&lookback)) . )
    ) )
    (?<target> \d . )
}msx;
my $re5_short = qr /(?!((?<= a |(?=(?-1)).))) (\d.) /sx;
for my $regex ($re5,$re5_short) {
    unlike "fo", $regex;
    unlike "x5", $regex;
    unlike "ab5 x4", $regex;
    like "5ab", $regex;
    like "x5 ab5", $regex;
    like "x5 ab5 x2", $regex;
    my @results;
    while ("x2 4x3a55aaa1" =~ /$regex/g) {
        push @results, $+{target} // $2;
    }
    is_deeply \@results, ["2 ","4x","3a"];
}

* Update: Hmm, actually, it turns out that dropping the outer (?! ) and negating the two inner assertions instead seems to work too... (although putting the exact explanation of why into words is eluding me at the moment...)

my $re5 = qr{
    (?<lookback>
        (?<! a | (?!(?&lookback)) . )
    )
    (?<target> \d . )
}msx;
my $re5_short = qr /((?<! a |(?!(?-1)).)) (\d.) /sx;
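
A quick way to check the "seems to work too" claim is to run both long forms over the same strings and compare what they capture. This is only a sketch: the two patterns are the ones from this post, but the names $re5_lookahead and $re5_lookbehind and the helper all_targets() are mine, and agreement on a handful of strings is of course not a proof.

```perl
use strict;
use warnings;

# The original negative form: (?! ) wrapped around a positive lookbehind.
my $re5_lookahead = qr{
    (?! (?<lookback> (?<= a | (?=(?&lookback)) . ) ) )
    (?<target> \d . )
}msx;
# The form from the update: negative lookbehind with a negative lookahead inside.
my $re5_lookbehind = qr{
    (?<lookback> (?<! a | (?!(?&lookback)) . ) )
    (?<target> \d . )
}msx;

# Collect every "target" capture that a regex finds in a string.
sub all_targets {
    my ($regex, $str) = @_;
    my @found;
    while ($str =~ /$regex/g) {
        push @found, $+{target};
    }
    return @found;
}

# Same sample strings as in the tests above.
my @samples = ("fo", "x5", "ab5 x4", "5ab", "x5 ab5", "x5 ab5 x2", "x2 4x3a55aaa1");
for my $str (@samples) {
    my $first  = join "|", all_targets($re5_lookahead,  $str);
    my $second = join "|", all_targets($re5_lookbehind, $str);
    printf "%-16s %-12s %-12s %s\n", qq{"$str"}, "[$first]", "[$second]",
        $first eq $second ? "same" : "DIFFERENT";
}
```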

Re^3: Variable-Width Lookbehind (hacked via recursion)
by haukex (Archbishop) on Oct 30, 2017 at 17:20 UTC
    putting the explanation of why into words...

    So the two key things to note are:

    • The pattern (?<!X) (for any character X) matches at the beginning of the string (because there is no preceding character), and
    • the double negation of (?<! (?! ) ) means that whatever the inner call to (?&lookback) returns (match/no match) is what the outer (?<lookback> ) will return. So what the last, innermost (furthest left) lookback returns is what the whole, outermost lookback will return.

    So for the regex in question it boils down to two cases:

    • If there is no preceding "a", then the regex will recurse all the way to the beginning of the string, where lookback will match.
    • If there is a preceding "a", then (?<!a) will cause the match to fail.
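
    The two cases can also be written out as an ordinary recursive sub and compared with the regex, position by position. This is a rough sketch under my own assumptions: $lookback_re is just the lookback group from the regex above taken on its own, and regex_lookback() and model_lookback() are hypothetical helper names.

```perl
use strict;
use warnings;

# The lookback group from the regex in question, without the target part.
my $lookback_re = qr{ (?<lookback> (?<! a | (?!(?&lookback)) . ) ) }msx;

# Ask the regex engine whether lookback matches at offset $pos of $str.
sub regex_lookback {
    my ($str, $pos) = @_;
    pos($str) = $pos;                 # start the next /g match at $pos
    return $str =~ /\G$lookback_re/g ? 1 : 0;
}

# Plain-Perl model of the same logic, following the explanation above.
sub model_lookback {
    my ($str, $pos) = @_;
    return 1 if $pos == 0;                         # beginning of string: match
    return 0 if substr($str, $pos - 1, 1) eq 'a';  # preceding "a": fail
    return model_lookback($str, $pos - 1);         # double negation: pass the inner result through
}

my $str = "x2 4x3a55aaa1";
for my $pos (0 .. length($str)) {
    printf "pos %2d: regex=%d model=%d\n",
        $pos, regex_lookback($str, $pos), model_lookback($str, $pos);
}
```

    In the model, the innermost call (the one that reaches position 0 or hits an "a") decides the result for every caller above it, which is the same thing the double negation does in the regex.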

    Minor edit for clarification.