Here's another way to look at things: instrument the regex with (?{ code }) print points (see Extended Patterns) and learn by experimentation. I'm also taking the liberty of introducing some other new constructs: the (?:pattern) non-capturing group (also see Extended Patterns); the /x regex modifier (all the preceding links are to sections of perlre); and the @- (aka @LAST_MATCH_START) regex special variable array (see perlvar).
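As a quick warm-up for the constructs just listed, here is a minimal sketch (the string and variable names are my own, chosen only for illustration) that uses /x, a (?:...) group and @- in a single match, without any (?{ code }) instrumentation yet.

```perl
use strict;
use warnings;

my $s = 'XXXxxxTGAxx';   # illustrative string, not from the thread

my ($whole_start, $tga_start);

# /x allows insignificant whitespace inside the pattern.
# (?: ... ) groups the triplet without capturing; (TGA) is capture group 1.
if ($s =~ m{ (?: \w\w\w )*? (TGA) }x) {
    # $-[0] is the start offset of the whole match,
    # $-[1] the start offset of capture group 1.
    ($whole_start, $tga_start) = ($-[0], $-[1]);
    print "whole match starts at $whole_start, '$1' starts at $tga_start\n";
}
```

The whole match starts at offset 0 because the lazy (?: \w\w\w )*? group consumes the two leading triplets before TGA is found at offset 6.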
First look at TGA matching against a simplified string without a \G anchor. Note that in contrast to some other code examples in this thread, the beginning offset of a match is reported.
    c:\@Work\Perl>perl -wMstrict -le
    "my $s = 'XXXxxxTGAxxTGAxxxxxxx';
     while ($s =~ m{ (?{ print qq{trying a match at offset }, pos $s })
                     (?: \w\w\w)*? (TGA) }xmsg) {
         print qq{matched TGA beginning at offset $-[1]};
     }
    "
    trying a match at offset 0
    matched TGA beginning at offset 6
    trying a match at offset 9
    trying a match at offset 10
    trying a match at offset 11
    matched TGA beginning at offset 11

After the successful TGA match at offsets 6 thru 8, the regex engine starts trying to match again at offset 9. The RE tries matches at offsets 9, 10 and 11 and finds a spurious (because it's not on a base-triplet boundary) match at offsets 11-13. (I'm not sure why the RE doesn't try matching from offset 14 onward.)
Now consider the effect of adding a \G anchor assertion.
    c:\@Work\Perl>perl -wMstrict -le
    "my $s = 'XXXxxxTGAxxTGAxxxxxxx';
     while ($s =~ m{ \G (?{ print qq{trying a match at offset }, pos $s })
                     (?: \w\w\w)*? (TGA) }xmsg) {
         print qq{matched TGA beginning at offset $-[1]};
     }
    "
    trying a match at offset 0
    matched TGA beginning at offset 6
    trying a match at offset 9

Now the RE can only begin another successful match at the offset immediately beyond the point at which the previous successful match ended, offset 9; it cannot try offsets 10 or 11 or any other because they do not satisfy the \G assertion.
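To put the two runs side by side, here is a small sketch that collects the TGA start offsets with and without \G. The helper name tga_offsets and its $anchored flag are my own invention for this illustration; the pattern and the string are the ones used above, minus the print instrumentation.

```perl
use strict;
use warnings;

# Return the start offset of every (TGA) capture found in $s.
# $anchored (hypothetical flag) selects the pattern with a leading \G.
sub tga_offsets {
    my ($s, $anchored) = @_;
    my $re = $anchored
        ? qr{ \G (?: \w\w\w )*? (TGA) }xms
        : qr{    (?: \w\w\w )*? (TGA) }xms;
    my @offsets;
    while ($s =~ /$re/g) {
        push @offsets, $-[1];   # start of capture group 1
    }
    return @offsets;
}

my $s = 'XXXxxxTGAxxTGAxxxxxxx';
print "without \\G: @{[ tga_offsets($s, 0) ]}\n";   # 6 11
print "with \\G:    @{[ tga_offsets($s, 1) ]}\n";   # 6
```

The off-frame hit at offset 11 appears only in the unanchored list, which agrees with the two command-line runs.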
Supplemental: We just got finished saying that in

    my $s = 'XXXxxxTGAxxTGAxxxxxxx';
    while ($s =~ m{ (?{ print qq{trying a match at offset }, pos $s })
                    (?: \w\w\w)*? (TGA) }xmsg) {
        print qq{matched TGA beginning at offset $-[1]};
    }

the RE will match the TGA at offset 11 because it's not constrained by a \G assertion. So in

    c:\@Work\Perl>perl -wMstrict -le
    "my $s = 'XXXxxxTGAxxTGAxxxxTGAxx';
     while ($s =~ m{ (?{ print qq{trying a match at offset }, pos $s })
                     (?: \w\w\w)*? (TGA) }xmsg) {
         print qq{matched TGA beginning at offset $-[1]};
     }
    "
    trying a match at offset 0
    matched TGA beginning at offset 6
    trying a match at offset 9
    matched TGA beginning at offset 18

(still no \G), why does the RE miss the TGA at offset 11 when there is another TGA present at offset 18 (which it does match)?
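One way to explore that question by experiment is to record $-[0] (where the whole match starts) alongside $-[1] (where TGA starts), and to chop the skipped text into the three-character frames the (?: \w\w\w)*? group consumed. This is only a sketch: the @matches array and the unpack-based framing are my own additions, not part of the code above.

```perl
use strict;
use warnings;

my $s = 'XXXxxxTGAxxTGAxxxxTGAxx';

# For each match, remember the start of the whole match and of capture 1.
my @matches;
while ($s =~ m{ (?: \w\w\w )*? (TGA) }xmsg) {
    push @matches, [ $-[0], $-[1] ];
}

for my $m (@matches) {
    my ($start, $tga) = @$m;
    # Split the text between the match start and TGA into 3-character frames.
    my @frames = unpack '(A3)*', substr($s, $start, $tga - $start);
    print "match starts at $start, TGA at $tga, skipped frames: @frames\n";
}
```

Comparing the overall start with the capture start, together with the skipped frames, suggests why offset 11 is never tried as a starting point in this string.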
Give a man a fish: <%-{-{-{-<
In reply to Re^6: Understanding a portion of perlretut
by AnomalousMonk
in thread Understanding a portion on the Perlretut
by BlueStarry