
test look-behind, '}' so succeeds
test look-ahead, '[' also succeeds
code block encountered, execute
advance pointer one place

That last step is wrong. It should be

advance pointer the length of the match

The length of the match is zero when only a lookahead and a lookbehind are used. Since the pointer is not advanced, the regexp matches at each such position twice. Only a final check prevents the regexp from returning the identical match a second time.

test look-behind, '}' so succeeds
test look-ahead, '[' also succeeds
code block encountered, execute
same match? no, continue
advance pointer the length of the match (0)
test look-behind, '}' so succeeds
test look-ahead, '[' also succeeds
code block encountered, execute
same match? yes, backtrack
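The trace above can be checked with a small script. This is only a minimal sketch, not the code from the original question: the sample string and the $count counter are made up for illustration, and the exact number of code block runs may differ between perl versions.

```perl
use strict;
use warnings;

# Made-up sample input: '}' is followed by '[' at two positions.
my $string = '{a}[b]{c}[d]';

# Package variable, so the (?{ ... }) block sees the same variable on any perl.
our $count = 0;

# Zero-width pattern: a look-behind, a look-ahead and a code block, nothing else.
# List assignment in scalar context gives the number of matches found.
my $matches = () = $string =~ /(?<=\})(?=\[)(?{ $count++ })/g;

print "matches: $matches, code block runs: $count\n";
```

If the engine behaves as described, the code block count should come out higher than the match count, because every zero-length match is retried once at the same position before the pointer moves on.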

More on this in Re: Regex code block executes twice per match using look-arounds.

Re^4: Regex code block executes twice per match using look-arounds
by johngg (Canon) on Jul 12, 2007 at 19:20 UTC
    advance pointer the length of the match

    That was the fundamental piece of the puzzle that I was missing. Indeed, after jettero's 2nd reply I started to investigate with use re 'debug'; and could see the mechanism you describe.

    I changed the regex so that it was using the look-behind but the look-ahead was replaced with a simple capture

        ...
        my $rxBetween = qr
            {(?x)
                (?<=($rxClose))
                ($rxOpen)
                (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}})
            };
        ...
        $string =~ s{$rxBetween}{+$2}g;
        ...

    and that stopped the double execution. It also seemed clear from the debug output that using both look-arounds was making the engine do a lot more work.
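
    A self-contained sketch of the same idea, for anyone who wants to try it. The definitions of $rxClose and $rxOpen and the sample string are hypothetical stand-ins, since the real ones are elided above, and the use re 'eval' line is only a precaution for older perls.

    ```perl
    use strict;
    use warnings;
    use re 'eval';   # precaution: older perls want this when a pattern mixes
                     # interpolated variables with a (?{ ... }) block

    # Hypothetical stand-ins; the post does not show the real definitions.
    my $rxClose = qr/[\]\})]/;
    my $rxOpen  = qr/[\[\{(]/;

    our $count = 0;
    my $string = '{a}[b]{c}[d]';

    # The look-behind stays zero-width, but ($rxOpen) consumes one character,
    # so every successful match has length 1 and the pointer moves on.
    my $changed = ($string =~ s{(?<=($rxClose))($rxOpen)(?{ $count++ })}{+$2}g);

    print "result: $string\n";
    print "substitutions: $changed, code block runs: $count\n";
    ```

    Because each match now has a non-zero length, there is no second attempt at the same position, so the code block should run once per substitution.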

    Thank you for your replies and the insights they have given.

    Cheers,

    JohnGG