Ohh my bad. I corrected G with \G but still no change ... prints nothing. Still not sure what difference it was supposed to make.

c:\@Work\Perl\monks>perl -wMstrict -le
"my $x = '1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15';
 print qq{string \$x: '$x'};
 ;;
 printf qq{captured '$1' } while $x =~ /\G(\d+)\s*kg\s*/ig;
 print '----------';
 "
string $x: '1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15'
----------

The  \G anchor matches at the point (the exact character offset in the string) at which matching stopped in the last  /g global match iteration. But on the first  /g iteration, where is that point? On the first  /g iteration,  \G matches the same as  \A (the \Absolute-start-of-string anchor).

So what  /\G(\d+)\s*kg\s*/ig says is:

  1. \G From the string offset at which the previous match stopped (or from the start of the string if it's the first match);
  2. (\d+) Match and capture one or more decimal digits;
  3. \s* Then match zero or more whitespace characters;
  4. kg Then match the literal characters  'kg' case-insensitively (due to the  /i flag);
  5. \s* Then match zero or more whitespace characters (this can't fail);
  6. And this match iteration is finished.

But your  '1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15' string begins with some digits, some whitespace, and then some more digits, not the required  'kg' literals: the match immediately fails. There is a  '3kg' subsequence further on that could satisfy part of the overall match, but matching has already failed due to the  \G assertion.
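To make the contrast concrete, here is a minimal sketch that runs the same string through the pattern with and without the leading \G. The array names @anchored and @floating are my own, chosen only for illustration.

```perl
use strict;
use warnings;

my $x = '1 2 3kg 4 5 6 7 8 9 10Kg 11 12 13 kg 14 15';

# With \G: the first attempt is pinned to offset 0, where '1 ' is
# followed by '2' rather than 'kg', so the loop body never runs.
my @anchored;
push @anchored, $1 while $x =~ /\G(\d+)\s*kg\s*/ig;

# Without \G: each /g iteration is free to scan forward from pos
# until it finds digits that are followed by 'kg'.
my @floating;
push @floating, $1 while $x =~ /(\d+)\s*kg\s*/ig;

print 'anchored: ', scalar(@anchored), " capture(s)\n";
print "floating: @floating\n";
```

Dropping \G is only one way to get the captures; whether it is the right fix depends on whether you actually need each match to start exactly where the previous one ended.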


Give a man a fish:  <%-{-{-{-<

Re^6: Regular expression
by pravakta (Novice) on Nov 02, 2017 at 20:02 UTC
    my $test_string= '12345';
    print "$1\n" if ($test_string=~ /(2)/g);       # Actually printed 2
    print "$1\n" while ($test_string=~ /\G(\d)/g); # Printed everything after 2, i.e. 3,4,5

    Thanks Anomalous for this explanation. If I understand you correctly, you mean that \G is a kind of anchor (like ^ and $), but instead of having a fixed location, its position depends on where the last match ended. In my code above, the first print statement printed only 2, and for the next print statement the pattern match started in the string at the location right after 2, so 3, 4 and 5 matched. I hope my understanding is correct so far.

    Can you tell me what the scope of \G would be? Suppose I use a different string in the next print statement: would \G still make the pattern match start from the position where it last matched in the first string? What if the second string is shorter than the first one and \G has a value greater than the second string's length?

      ... what the scope of \G would be? Suppose I use a different string in the next print statement: would \G still make the pattern match start from the position where it last matched in the first string?

      Each individual string has an independent "position of end of last successful match" attribute that is returned by the pos built-in. The  \G regex operator (enabled by the  /g modifier) accesses this attribute of a string being matched to assert that matching in that string is continuing where previous matching in that string by any  m//g match left off.

      c:\@Work\Perl\monks>perl -wMstrict -le
      "my $s1 = 'foobarfeefiefoefum';
       $s1 =~ /foo/g;
       ;;
       my $s2 = '123456789';
       $s2 =~ /6/g;
       ;;
       print qq{A: pos in \$s1 '$s1' after successful match == }, pos $s1;
       print qq{B: pos in \$s2 '$s2' after successful match == }, pos $s2;
       ;;
       $s1 =~ /foe/g;
       print qq{C: pos in \$s1 '$s1' after successful match == }, pos $s1;
       print qq{D: pos in \$s2 '$s2' still == }, pos $s2;
       "
      A: pos in $s1 'foobarfeefiefoefum' after successful match == 3
      B: pos in $s2 '123456789' after successful match == 6
      C: pos in $s1 'foobarfeefiefoefum' after successful match == 15
      D: pos in $s2 '123456789' still == 6

      What would have happened if the second match against the  $s1 string had been  /\Gfoe/g instead? Or  /\Gbar/g instead? Try it and see! (See also the documentation concerning the effect of the  /c modifier in conjunction with  /g in a  m//gc match.)
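For anyone who wants to check their own attempt, here is a rough sketch of the three cases just mentioned, run against the same $s1 string. The variable names ($foe_matched, $pos_after_fail and so on) and the 'xyz' pattern are hypothetical, picked only to keep the cases apart.

```perl
use strict;
use warnings;

my $s1 = 'foobarfeefiefoefum';

# Case 1: /\Gfoe/g right after /foo/g.
$s1 =~ /foo/g;                                   # pos($s1) is now 3
my $foe_matched    = ($s1 =~ /\Gfoe/g) ? 1 : 0;  # offset 3 holds 'bar', so this fails
my $pos_after_fail = pos $s1;                    # a failed m//g resets pos to undef

# Case 2: /\Gbar/g right after /foo/g.
$s1 =~ /foo/g;                                   # search restarts at 0; pos is 3 again
my $bar_matched   = ($s1 =~ /\Gbar/g) ? 1 : 0;   # 'bar' does sit at offset 3
my $pos_after_bar = pos $s1;                     # 6

# Case 3: a failing match with /gc leaves pos where it was.
my $xyz_matched  = ($s1 =~ /\Gxyz/gc) ? 1 : 0;   # fails, but /c preserves pos
my $pos_after_gc = pos $s1;                      # still 6

printf "foe: matched=%d pos=%s\n", $foe_matched,
    defined $pos_after_fail ? $pos_after_fail : 'undef';
printf "bar: matched=%d pos=%d\n", $bar_matched, $pos_after_bar;
printf "xyz: matched=%d pos=%d\n", $xyz_matched, $pos_after_gc;
```

The /gc behaviour in case 3 is what lets a lexer-style scanner try several \G-anchored alternatives in turn at the same offset without losing its place.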

      (Incidentally, what if  $test_string in your example code was  '12xxx345' and the match was  /(\d)/g (no \G) instead? What if it was  /\G(\d)/g as originally?)
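A small sketch of that experiment, assuming the same preceding /(2)/g match as in your code. The names $plain, $pinned, @floating and @anchored are mine; two separate copies of the string are used so that each has its own pos.

```perl
use strict;
use warnings;

# No \G: after /(2)/g leaves pos at 2, each /g iteration may skip
# over the 'xxx' run to reach the next digit.
my $plain = '12xxx345';
$plain =~ /(2)/g;
my @floating;
push @floating, $1 while $plain =~ /(\d)/g;

# With \G: the next match must start exactly at offset 2, which holds
# an 'x', so the loop stops before capturing anything.
my $pinned = '12xxx345';
$pinned =~ /(2)/g;
my @anchored;
push @anchored, $1 while $pinned =~ /\G(\d)/g;

print "floating: @floating\n";
print 'anchored: ', scalar(@anchored), " capture(s)\n";
```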

