advance pointer the length of the match

That was the fundamental piece of the puzzle that I was missing. Indeed, after jettero's 2nd reply I started to investigate with use re 'debug'; and could see the mechanism you describe.

I changed the regex so that it still used the look-behind, but replaced the look-ahead with a simple capture:

    ...
    my $rxBetween = qr
        {(?x)
            (?<=($rxClose))
            ($rxOpen)
            (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}})
        };
    ...
    $string =~ s{$rxBetween}{+$2}g;
    ...

and that stopped the double execution. It also seemed clear from the debug output that using both look-arounds was making the engine do a lot more work.
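For anyone wanting to reproduce the comparison, here is a minimal sketch that runs both variants against the same test string and counts how often the code block fires. The variable names ($rxLookarounds, $rxCapture, $viaLookarounds and so on) are my own for illustration and are not from the original script, and the code block only increments a counter instead of printing. On the Perl used for the output below, the look-around version ran the block twice per match; I would not assume the exact count is the same on every Perl version, so the script just reports it.

```perl
use strict;
use warnings;

# Package variable so the (?{ ... }) code blocks can see it reliably.
our $count = 0;

# Slash delimiters here, because the classes contain unbalanced braces.
my $rxOpen  = qr/[[<{(]/;
my $rxClose = qr/[]>})]/;

# Variant 1: look-behind and look-ahead, so every match is zero-length.
my $rxLookarounds = qr{(?x)
    (?<=($rxClose))
    (?=($rxOpen))
    (?{ $count++ })
};

# Variant 2: look-behind plus a capture, so every match consumes one character.
my $rxCapture = qr{(?x)
    (?<=($rxClose))
    ($rxOpen)
    (?{ $count++ })
};

my $string = '<x1>[x2]{x3}(x4)';

# Zero-length matches: nothing is consumed, so only the '+' is inserted.
$count = 0;
( my $viaLookarounds = $string ) =~ s{$rxLookarounds}{+}g;
my $countLookarounds = $count;

# The capture consumes the opening bracket, so it has to be put back via $2.
$count = 0;
( my $viaCapture = $string ) =~ s{$rxCapture}{+$2}g;
my $countCapture = $count;

print "look-arounds: $viaLookarounds (code block ran $countLookarounds times)\n";
print "capture:      $viaCapture (code block ran $countCapture times)\n";
```

Both substitutions should produce the same string; the difference is only in how many times the code block runs. With the capture each match has length 1, so the engine moves past it and never has to retry at the same position.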

Using look-behind and look-ahead

    Compiling REx `[[<{(]'
    size 12 Got 100 bytes for offset annotations.
    first at 1
       1: ANYOF[(<[{](12)
      12: END(0)
    stclass `ANYOF[(<[{]' minlen 1
    Offsets: [12]
        1[6] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 7[0]
    Compiling REx `[]>})]'
    size 12 Got 100 bytes for offset annotations.
    first at 1
       1: ANYOF[)>\]}](12)
      12: END(0)
    stclass `ANYOF[)>\]}]' minlen 1
    Offsets: [12]
        1[6] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 7[0]
    Compiling REx `(?x) (?<=((?-xism:[]>})]))) (?=((?-xism:[[<{(]))) (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}}) '
    size 41 Got 332 bytes for offset annotations.
    first at 1
       1: IFMATCH[-1](20)
       3: OPEN1(5)
       5: ANYOF[)>\]}](16)
      16: CLOSE1(18)
      18: SUCCEED(0)
      19: TAIL(20)
      20: IFMATCH[-0](39)
      22: OPEN2(24)
      24: ANYOF[(<[{](35)
      35: CLOSE2(37)
      37: SUCCEED(0)
      38: TAIL(39)
      39: EVAL(41)
      41: END(0)
    minlen 0 with eval
    Offsets: [41]
        17[17] 0[0] 17[1] 0[0] 26[6] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0]
        0[0] 0[0] 0[0] 33[1] 0[0] 33[0] 33[0] 46[17] 0[0] 46[1] 0[0] 55[6]
        0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 62[1] 0[0] 62[0]
        62[0] 72[68] 0[0] 140[0]
    String: <x1>[x2]{x3}(x4)
    ----------------------------------------
    Matching REx `(?x) (?<=((?-xism:[]>})]))) (?=((?-xism:[[<{(]))) (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}}) ...' against `<x1>[x2]{x3}(x4)'
      Setting an EVAL scope, savestack=25
       0 <> <<x1>[x2]{x3}>    |  1:  IFMATCH[-1]
                                  failed...
      Setting an EVAL scope, savestack=25
       1 <<> <x1>[x2]{x3}>    |  1:  IFMATCH[-1]
       0 <> <<x1>[x2]{x3}>    |  3:  OPEN1
       0 <> <<x1>[x2]{x3}>    |  5:  ANYOF[)>\]}]
                                  failed...
                                  failed...
      Setting an EVAL scope, savestack=25
       2 <<x> <1>[x2]{x3}>    |  1:  IFMATCH[-1]
       1 <<> <x1>[x2]{x3}>    |  3:  OPEN1
       1 <<> <x1>[x2]{x3}>    |  5:  ANYOF[)>\]}]
                                  failed...
                                  failed...
      Setting an EVAL scope, savestack=25
       3 <<x1> <>[x2]{x3}>    |  1:  IFMATCH[-1]
       2 <<x> <1>[x2]{x3}>    |  3:  OPEN1
       2 <<x> <1>[x2]{x3}>    |  5:  ANYOF[)>\]}]
                                  failed...
                                  failed...
      Setting an EVAL scope, savestack=25
       4 <<x1>> <[x2]{x3}>    |  1:  IFMATCH[-1]
       3 <<x1> <>[x2]{x3}>    |  3:  OPEN1
       3 <<x1> <>[x2]{x3}>    |  5:  ANYOF[)>\]}]
       4 <<x1>> <[x2]{x3}>    | 16:  CLOSE1
       4 <<x1>> <[x2]{x3}>    | 18:  SUCCEED
                                  could match...
       4 <<x1>> <[x2]{x3}>    | 20:  IFMATCH[-0]
       4 <<x1>> <[x2]{x3}>    | 22:  OPEN2
       4 <<x1>> <[x2]{x3}>    | 24:  ANYOF[(<[{]
       5 <<x1>[> <x2]{x3}>    | 35:  CLOSE2
       5 <<x1>[> <x2]{x3}>    | 37:  SUCCEED
                                  could match...
       4 <<x1>> <[x2]{x3}>    | 39:  EVAL
      re_eval 0x63a28
    Match 1: on left >, on right [
       4 <<x1>> <[x2]{x3}>    | 41:  END
    Match successful!
    Matching REx `(?x) (?<=((?-xism:[]>})]))) (?=((?-xism:[[<{(]))) (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}}) ...' against `[x2]{x3}(x4)'
      Setting an EVAL scope, savestack=37
       4 <<x1>> <[x2]{x3}>    |  1:  IFMATCH[-1]
       3 <<x1> <>[x2]{x3}>    |  3:  OPEN1
       3 <<x1> <>[x2]{x3}>    |  5:  ANYOF[)>\]}]
       4 <<x1>> <[x2]{x3}>    | 16:  CLOSE1
       4 <<x1>> <[x2]{x3}>    | 18:  SUCCEED
                                  could match...
       4 <<x1>> <[x2]{x3}>    | 20:  IFMATCH[-0]
       4 <<x1>> <[x2]{x3}>    | 22:  OPEN2
       4 <<x1>> <[x2]{x3}>    | 24:  ANYOF[(<[{]
       5 <<x1>[> <x2]{x3}>    | 35:  CLOSE2
       5 <<x1>[> <x2]{x3}>    | 37:  SUCCEED
                                  could match...
       4 <<x1>> <[x2]{x3}>    | 39:  EVAL
      re_eval 0x63a28
    Match 2: on left >, on right [
       4 <<x1>> <[x2]{x3}>    | 41:  END
    Match possible, but length=0 is smaller than requested=1, failing!
      Clearing an EVAL scope, savestack=37..40
      Setting an EVAL scope, savestack=37
       5 <<x1>[> <x2]{x3}>    |  1:  IFMATCH[-1]
       4 <<x1>> <[x2]{x3}>    |  3:  OPEN1
       4 <<x1>> <[x2]{x3}>    |  5:  ANYOF[)>\]}]
                                  failed...
    ...

Using look-behind and capture

    Compiling REx `[[<{(]'
    size 12 Got 100 bytes for offset annotations.
    first at 1
       1: ANYOF[(<[{](12)
      12: END(0)
    stclass `ANYOF[(<[{]' minlen 1
    Offsets: [12]
        1[6] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 7[0]
    Compiling REx `[]>})]'
    size 12 Got 100 bytes for offset annotations.
    first at 1
       1: ANYOF[)>\]}](12)
      12: END(0)
    stclass `ANYOF[)>\]}]' minlen 1
    Offsets: [12]
        1[6] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 7[0]
    Compiling REx `(?x) (?<=((?-xism:[]>})]))) ((?-xism:[[<{(])) (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}}) '
    size 37 Got 300 bytes for offset annotations.
    first at 1
    synthetic stclass `ANYOF[(<[{]'.
       1: IFMATCH[-1](20)
       3: OPEN1(5)
       5: ANYOF[)>\]}](16)
      16: CLOSE1(18)
      18: SUCCEED(0)
      19: TAIL(20)
      20: OPEN2(22)
      22: ANYOF[(<[{](33)
      33: CLOSE2(35)
      35: EVAL(37)
      37: END(0)
    stclass `ANYOF[(<[{]' minlen 1 with eval
    Offsets: [37]
        17[17] 0[0] 17[1] 0[0] 26[6] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0]
        0[0] 0[0] 0[0] 33[1] 0[0] 33[0] 33[0] 43[1] 0[0] 52[6] 0[0] 0[0]
        0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 59[1] 0[0] 68[68] 0[0]
        136[0]
    String: <x1>[x2]{x3}(x4)
    ----------------------------------------
    Matching REx `(?x) (?<=((?-xism:[]>})]))) ((?-xism:[[<{(])) (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}}) ...' against `<x1>[x2]{x3}(x4)'
    Matching stclass `ANYOF[(<[{]' against `<x1>[x2]{x3}(x4)'
      Setting an EVAL scope, savestack=25
       0 <> <<x1>[x2]{x3}>    |  1:  IFMATCH[-1]
                                  failed...
      Setting an EVAL scope, savestack=25
       4 <<x1>> <[x2]{x3}>    |  1:  IFMATCH[-1]
       3 <<x1> <>[x2]{x3}>    |  3:  OPEN1
       3 <<x1> <>[x2]{x3}>    |  5:  ANYOF[)>\]}]
       4 <<x1>> <[x2]{x3}>    | 16:  CLOSE1
       4 <<x1>> <[x2]{x3}>    | 18:  SUCCEED
                                  could match...
       4 <<x1>> <[x2]{x3}>    | 20:  OPEN2
       4 <<x1>> <[x2]{x3}>    | 22:  ANYOF[(<[{]
       5 <<x1>[> <x2]{x3}>    | 33:  CLOSE2
       5 <<x1>[> <x2]{x3}>    | 35:  EVAL
      re_eval 0x63a28
    Match 1: on left >, on right [
       5 <<x1>[> <x2]{x3}>    | 37:  END
    Match successful!
    Matching REx `(?x) (?<=((?-xism:[]>})]))) ((?-xism:[[<{(])) (?{print qq{Match @{ [++ $count] }: on left $1, on right $2\n}}) ...' against `x2]{x3}(x4)'
    ...

Thank you for your replies and the insights they have given.

Cheers,

JohnGG


In reply to Re^4: Regex code block executes twice per match using look-arounds by johngg
in thread Regex code block executes twice per match using look-arounds by johngg
