in reply to \G and regexes
Running with -Mre=debug (that is, use re 'debug') shows you what the regex engine is doing.
Split from the start into three-character groups, the string bbbbabcabc is bbb bab cab c, and none of those group boundaries is followed by abc, so the \G-anchored pattern cannot match. Compare:

perl -Mre=debug -le"print $1 while q!bbbbabcabc! =~ /\G(\w\w\w)*?abc/g"
perl -Mre=debug -le"print $1 while q!bbbbabcabc! =~ /(\w\w\w)*?abc/g"
$ perl -Mre=debug -le"print $1 while q!bbbbabcabc! =~ /\G(\w\w\w)*?abc/g"
Compiling REx "\G(\w\w\w)*?abc"
synthetic stclass "ANYOF[0-9A-Z_a-z][{unicode_all}]".
Final program:
   1: GPOS (2)
   2: MINMOD (3)
   3: CURLYM[1] {0,32767} (14)
   7:   ALNUM (8)
   8:   ALNUM (9)
   9:   ALNUM (12)
  12:   SUCCEED (0)
  13:   NOTHING (14)
  14: EXACT <abc> (16)
  16: END (0)
floating "abc" at 0..2147483647 (checking floating) stclass ANYOF[0-9A-Z_a-z][{unicode_all}] anchored(GPOS) GPOS:0 minlen 3
Matching REx "\G(\w\w\w)*?abc" against "bbbbabcabc"
   0 <> <bbbbabcabc> | 1:GPOS(2)
   0 <> <bbbbabcabc> | 2:MINMOD(3)
   0 <> <bbbbabcabc> | 3:CURLYM[1] {0,32767}(14)
  CURLYM trying tail with matches=0...
   0 <> <bbbbabcabc> | 7: ALNUM(8)
   1 <b> <bbbabcabc> | 8: ALNUM(9)
   2 <bb> <bbabcabc> | 9: ALNUM(12)
   3 <bbb> <babcabc> | 12: SUCCEED(0)
  subpattern success...
  CURLYM now matched 1 times, len=3...
  CURLYM trying tail with matches=1...
   3 <bbb> <babcabc> | 7: ALNUM(8)
   4 <bbbb> <abcabc> | 8: ALNUM(9)
   5 <bbbba> <bcabc> | 9: ALNUM(12)
   6 <bbbbab> <cabc> | 12: SUCCEED(0)
  subpattern success...
  CURLYM now matched 2 times, len=3...
  CURLYM trying tail with matches=2...
   6 <bbbbab> <cabc> | 7: ALNUM(8)
   7 <bbbbabc> <abc> | 8: ALNUM(9)
   8 <bbbbabca> <bc> | 9: ALNUM(12)
   9 <bbbbabcab> <c> | 12: SUCCEED(0)
  subpattern success...
  CURLYM now matched 3 times, len=3...
  CURLYM trying tail with matches=3...
   9 <bbbbabcab> <c> | 7: ALNUM(8)
  10 <bbbbabcabc> <> | 8: ALNUM(9)
  failed...
  failed...
Match failed
Freeing REx: "\G(\w\w\w)*?abc"
$ perl -Mre=debug -le"print $1 while q!bbbbabcabc! =~ /(\w\w\w)*?abc/g"
Compiling REx "(\w\w\w)*?abc"
synthetic stclass "ANYOF[0-9A-Z_a-z][{unicode_all}]".
Final program:
   1: MINMOD (2)
   2: CURLYM[1] {0,32767} (13)
   6:   ALNUM (7)
   7:   ALNUM (8)
   8:   ALNUM (11)
  11:   SUCCEED (0)
  12:   NOTHING (13)
  13: EXACT <abc> (15)
  15: END (0)
floating "abc" at 0..2147483647 (checking floating) stclass ANYOF[0-9A-Z_a-z][{unicode_all}] minlen 3
Guessing start of match in sv for REx "(\w\w\w)*?abc" against "bbbbabcabc"
Found floating substr "abc" at offset 4...
start_shift: 0 check_at: 4 s: 0 endpos: 5
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx "(\w\w\w)*?abc" against "bbbbabcabc"
Matching stclass ANYOF[0-9A-Z_a-z][{unicode_all}] against "bbbbabca" (8 chars)
   0 <> <bbbbabcabc> | 1:MINMOD(2)
   0 <> <bbbbabcabc> | 2:CURLYM[1] {0,32767}(13)
  CURLYM trying tail with matches=0...
   0 <> <bbbbabcabc> | 6: ALNUM(7)
   1 <b> <bbbabcabc> | 7: ALNUM(8)
   2 <bb> <bbabcabc> | 8: ALNUM(11)
   3 <bbb> <babcabc> | 11: SUCCEED(0)
  subpattern success...
  CURLYM now matched 1 times, len=3...
  CURLYM trying tail with matches=1...
   3 <bbb> <babcabc> | 6: ALNUM(7)
   4 <bbbb> <abcabc> | 7: ALNUM(8)
   5 <bbbba> <bcabc> | 8: ALNUM(11)
   6 <bbbbab> <cabc> | 11: SUCCEED(0)
  subpattern success...
  CURLYM now matched 2 times, len=3...
  CURLYM trying tail with matches=2...
   6 <bbbbab> <cabc> | 6: ALNUM(7)
   7 <bbbbabc> <abc> | 7: ALNUM(8)
   8 <bbbbabca> <bc> | 8: ALNUM(11)
   9 <bbbbabcab> <c> | 11: SUCCEED(0)
  subpattern success...
  CURLYM now matched 3 times, len=3...
  CURLYM trying tail with matches=3...
   9 <bbbbabcab> <c> | 6: ALNUM(7)
  10 <bbbbabcabc> <> | 7: ALNUM(8)
  failed...
  failed...
   1 <b> <bbbabcabc> | 1:MINMOD(2)
   1 <b> <bbbabcabc> | 2:CURLYM[1] {0,32767}(13)
  CURLYM trying tail with matches=0...
   1 <b> <bbbabcabc> | 6: ALNUM(7)
   2 <bb> <bbabcabc> | 7: ALNUM(8)
   3 <bbb> <babcabc> | 8: ALNUM(11)
   4 <bbbb> <abcabc> | 11: SUCCEED(0)
  subpattern success...
  CURLYM now matched 1 times, len=3...
  CURLYM trying tail with matches=1...
   4 <bbbb> <abcabc> | 13: EXACT <abc>(15)
   7 <bbbbabc> <abc> | 15: END(0)
Match successful!
bbb
Guessing start of match in sv for REx "(\w\w\w)*?abc" against "abc"
Found floating substr "abc" at offset 0...
start_shift: 0 check_at: 7 s: 7 endpos: 8
Does not contradict STCLASS...
Guessed: match at offset 0
Matching REx "(\w\w\w)*?abc" against "abc"
Matching stclass ANYOF[0-9A-Z_a-z][{unicode_all}] against "a" (1 chars)
   7 <bbbbabc> <abc> | 1:MINMOD(2)
   7 <bbbbabc> <abc> | 2:CURLYM[1] {0,32767}(13)
  CURLYM trying tail with matches=0...
   7 <bbbbabc> <abc> | 13: EXACT <abc>(15)
  10 <bbbbabcabc> <> | 15: END(0)
Match successful!
Freeing REx: "(\w\w\w)*?abc"
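
If the debug trace is more than you want to read, the same difference can be seen by recording where each pattern matches. This is only a minimal sketch of my own, not something from the question; the array names @anchored and @unanchored are made up for the illustration, and it relies only on the built-in @- and @+ offset arrays.

```perl
use strict;
use warnings;

my $str = 'bbbbabcabc';

# With \G the first attempt must start exactly at pos(), which is 0 here,
# so "abc" can only be tried at offsets 0, 3, 6 and 9.
my @anchored;
while ($str =~ /\G(\w\w\w)*?abc/g) {
    push @anchored, [ $-[0], $+[0], $1 ];
}

# Without \G the engine may move the starting point forward one
# character at a time until some start position lets the pattern match.
my @unanchored;
while ($str =~ /(\w\w\w)*?abc/g) {
    push @unanchored, [ $-[0], $+[0], $1 ];
}

printf "with \\G:    %d match(es)\n", scalar @anchored;
printf "without \\G: %d match(es)\n", scalar @unanchored;
for my $m (@unanchored) {
    my ($from, $to, $cap) = @$m;
    # $1 is undef when the group matched zero times
    printf "  offsets %d..%d  \$1=%s\n",
        $from, $to, defined $cap ? "'$cap'" : 'undef';
}
```

The anchored loop should find nothing, which agrees with the "Match failed" in the first trace. The unanchored loop should find two matches, the first starting at offset 1 where bbb is followed by abc, which is the start position the second trace settles on after giving up at offset 0.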