Here's another way to look at things: instrument the regex with (?{ code }) print points (see Extended Patterns) and learn by experimentation. I'm also taking the liberty of introducing some other new constructs: the (?:pattern) non-capturing grouping (also see Extended Patterns); the /x regex modifier (all of the preceding are documented in perlre); and the @- (aka @LAST_MATCH_START) regex special array variable (see perlvar).
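For anyone who hasn't met these constructs yet, here is a minimal sketch of (?:pattern), /x and @- working together. The sample string and the variable names are made up for illustration; they are not taken from the thread.

```perl
use strict;
use warnings;

my $s = 'abcTGAdef';    # made-up sample string
my ($whole_start, $group_start);

# /x allows whitespace and comments inside the pattern.
# (?: ... ) groups without capturing, so (TGA) is still capture group 1.
if ($s =~ m{
        (?: \w\w\w )*?    # zero or more triplets, as few as possible
        (TGA)             # capture group 1
    }x) {
    # $-[0] is the start offset of the whole match, $-[1] that of group 1.
    ($whole_start, $group_start) = ($-[0], $-[1]);
    print "whole match starts at offset $whole_start\n";        # 0
    print "capture group 1 starts at offset $group_start\n";    # 3
}
```

The whole match begins at offset 0 because the lazy triplet group has to consume 'abc' before TGA can match at offset 3.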

First look at TGA matching against a simplified string without a \G anchor. Note that, in contrast to some other code examples in this thread, the beginning offset of each match is reported.

    c:\@Work\Perl>perl -wMstrict -le
    "my $s = 'XXXxxxTGAxxTGAxxxxxxx';
     while ($s =~ m{ (?{ print qq{trying a match at offset }, pos $s }) (?: \w\w\w)*? (TGA) }xmsg) {
         print qq{matched TGA beginning at offset $-[1]};
     }
    "
    trying a match at offset 0
    matched TGA beginning at offset 6
    trying a match at offset 9
    trying a match at offset 10
    trying a match at offset 11
    matched TGA beginning at offset 11

After the successful TGA match at offsets 6 through 8, the regex engine starts trying to match again at offset 9. The RE tries matches at offsets 9, 10 and 11 and finds a spurious (because it's not on a base-triplet boundary) match at offsets 11-13. (I'm not sure why the RE doesn't try matching from offset 14 onward.)

Now consider the effect of adding a \G anchor assertion.

    c:\@Work\Perl>perl -wMstrict -le
    "my $s = 'XXXxxxTGAxxTGAxxxxxxx';
     while ($s =~ m{ \G (?{ print qq{trying a match at offset }, pos $s }) (?: \w\w\w)*? (TGA) }xmsg) {
         print qq{matched TGA beginning at offset $-[1]};
     }
    "
    trying a match at offset 0
    matched TGA beginning at offset 6
    trying a match at offset 9

Now the RE can only begin another successful match at the offset immediately beyond the point at which the previous successful match ended, offset 9; it cannot try offsets 10 or 11 or any other because they do not satisfy the \G assertion.
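The two runs can also be compared side by side in a small script. This is only a sketch: tga_offsets is a hypothetical helper name of my own, and it leaves out the (?{ code }) print points so that it simply returns the offsets at which (TGA) matched.

```perl
use strict;
use warnings;

# Hypothetical helper: return the offsets at which (TGA) matched,
# with or without a leading \G anchor.
sub tga_offsets {
    my ($s, $anchored) = @_;
    my $re = $anchored
        ? qr{ \G (?: \w\w\w )*? (TGA) }x
        : qr{    (?: \w\w\w )*? (TGA) }x;
    my @offsets;
    while ($s =~ m/$re/g) {
        push @offsets, $-[1];    # start offset of capture group 1
    }
    return @offsets;
}

my $s = 'XXXxxxTGAxxTGAxxxxxxx';
print "without \\G: @{[ tga_offsets($s, 0) ]}\n";    # 6 11
print "with \\G:    @{[ tga_offsets($s, 1) ]}\n";    # 6
```

Without \G the off-frame TGA at offset 11 is reported; with \G the second attempt is pinned to offset 9, fails there, and the loop ends.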

Supplemental: We just got finished saying that in

    my $s = 'XXXxxxTGAxxTGAxxxxxxx';
    while ($s =~ m{ (?{ print qq{trying a match at offset }, pos $s }) (?: \w\w\w)*? (TGA) }xmsg) {
        print qq{matched TGA beginning at offset $-[1]};
    }

the RE will match the TGA at offset 11 because it's not constrained by a \G assertion. So in

    c:\@Work\Perl>perl -wMstrict -le
    "my $s = 'XXXxxxTGAxxTGAxxxxTGAxx';
     while ($s =~ m{ (?{ print qq{trying a match at offset }, pos $s }) (?: \w\w\w)*? (TGA) }xmsg) {
         print qq{matched TGA beginning at offset $-[1]};
     }
    "
    trying a match at offset 0
    matched TGA beginning at offset 6
    trying a match at offset 9
    matched TGA beginning at offset 18

(still no \G), why does the RE miss the TGA at offset 11 when there is another TGA present at offset 18 (which it does match)?
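One way to experiment with this question is to report $-[0] (where the whole match began) alongside $-[1] (where the captured TGA began). This is a sketch only; the @report array is my own addition so the results can be inspected afterwards.

```perl
use strict;
use warnings;

my $s = 'XXXxxxTGAxxTGAxxxxTGAxx';
my @report;    # one "whole_start:tga_start" entry per successful match

while ($s =~ m{ (?: \w\w\w )*? (TGA) }xg) {
    # $-[0] is where the whole match began; $-[1] is where (TGA) began.
    push @report, join ':', $-[0], $-[1];
    print "whole match began at offset $-[0], TGA began at offset $-[1]\n";
}
# @report is now ('0:6', '9:18')
```

Note where the second whole match begins, and count how many characters lie between that offset and the TGA at offset 18.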


Give a man a fish:  <%-{-{-{-<


In reply to Re^6: Understanding a portion of perlretut by AnomalousMonk
in thread Understanding a portion on the Perlretut by BlueStarry
