Really interesting question; I learned quite a bit working through it. I commented the code heavily to explain what happens at each step. The short version: it matters whether you run a global match in scalar or list context. If you do a global match in SCALAR context, like this:
$string =~ /(simple)/g;
The match returns true after finding only the FIRST match and then stops searching. It sets pos($string) to the offset just past the end of that match. If you run a global match AGAIN on the same string, it does not start from the beginning of the string but from where the last match ended. A match with the /g modifier therefore has side effects. pos($string) is reset when a match fails (unless you also use the /c modifier), or you can reset it manually with pos($string) = 0.
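To see the pos() side effect in isolation, here is a minimal sketch. The sample string and the helper name scalar_match_positions are made up for illustration; only pos() and the /g modifier come from Perl itself.

```perl
use strict;
use warnings;

# Hypothetical helper: run up to $n scalar-context /g matches on a copy of
# the string and record pos() after each successful match.
sub scalar_match_positions {
    my ($string, $n) = @_;
    my @positions;
    for (1 .. $n) {
        last unless $string =~ /simple/g;   # scalar context: one match per call
        push @positions, pos($string);
    }
    return @positions;
}

my $string = "a simple, simple test";
$string =~ /simple/g;                       # finds the first "simple"
print "after first match:  ", pos($string), "\n";   # prints 8
$string =~ /simple/g;                       # resumes from offset 8
print "after second match: ", pos($string), "\n";   # prints 16
pos($string) = 0;                           # manual reset
$string =~ /simple/g;                       # starts from the beginning again
print "after reset:        ", pos($string), "\n";   # prints 8
```

The helper stops as soon as a match fails, which is also the point at which Perl resets pos() on its own.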
However, if you want to keep matching all the way to the END of the string and collect every captured value, you need to run the match in LIST context, like this:
my @matches = $string =~ /(simple)/g;
This matches all the way to the end of the string and returns every instance of "simple". To do the same thing in scalar context, you have to use a while loop, like this:
while($string =~/(simple)/g){
my $position = pos($string);
print "most recent match is $1, at position $position\n";
}
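If you need where each match starts as well as where it ends, the same kind of loop can read the builtin @- and @+ arrays. This is only a sketch: match_spans is a hypothetical helper name, and the sample string is borrowed from the full script.

```perl
use strict;
use warnings;

# Hypothetical helper: return a [start, end] offset pair for every match
# of "simple" in the given string.
sub match_spans {
    my ($string) = @_;
    my @spans;
    while ($string =~ /(simple)/g) {
        # $-[0] is the start of the whole match; $+[0] is its end,
        # which is the same value pos($string) holds at this point.
        push @spans, [ $-[0], $+[0] ];
    }
    return @spans;
}

for my $span (match_spans("This is a simple thing, just a simple simple thing.")) {
    print "match from $span->[0] to $span->[1]\n";
}
```

Returning the pairs instead of printing inside the loop keeps the loop body small and avoids touching pos() while the match is still iterating.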
This goes all the way to the end of the string in SCALAR context and finds every instance of "simple". In case that is confusing, here is the entire code sample, which is also commented and should hopefully explain the difference between /g matching in SCALAR context and in LIST context.
#!/usr/bin/perl -w
=begin comment
Running a global match on the same string twice can have side effects.
pos($testString), the ending position of the last match, changes behind
the scenes. With the /g modifier in scalar context, the next search
starts from the position of the last match, not from the beginning of
the string. To reset this position and restart from the beginning, you
need a failed match, or you can reset it manually with the pos()
function, i.e. pos($test3) = 0 restarts matching from the beginning of
the string.
Also, @- is the builtin array that holds the starting offsets of the
most recent match ($-[0] is the start of the whole match), which is
handy if you are tracking positions manually, as in a while loop.
=end comment
=cut
my $test1 = "This is a simple thing, just a simple simple thing.";
my ($test2, $test3) = ($test1, $test1);
my @matches;
my $position;
print "String: \"$test1\"\n\n";
print "first test scalar context:\n";
$test1 =~ /(simple)/g;
$position = pos($test1);
print "\$1 is $1, pos(\$test1) is $position\n" if($1);
# $2 and $3 are never set here: they refer to a second and third capture
# group, and the pattern has only one. A scalar-context /g match also
# finds only the first match.
print "\$2 is $2, pos(\$test1) is $position\n" if($2);
print "\$3 is $3, pos(\$test1) is $position\n" if($3);
print "\n";
print "second test list context:\n";
# Matches all three in list context. pos($test2) is left undefined
# afterwards, so use a while loop (not foreach) if you need the positions.
@matches = $test2 =~ /(simple)/g;
my $i = 0;
for (@matches){
print "Match $i is $_\n";
$i++;
}
print "\n";
=begin comment
Scalar context /g does not go to the end of the string; it stops at the
first match, and the next search begins where that match ended. To match
all the way to the end of a string, use list context or a scalar-context
match in a loop. A loop is useful if you need the position of each match,
which will be in pos($test3).
#https://www.oreilly.com/library/view/perl-in-a/1565922867/re148.html
=end comment
=cut
print "third test scalar context but with looping:\n";
$i=0;
while($test3 =~ /(simple)/g){
my $position = pos($test3);
print "\$1 is $1, pos(\$test3) is $position, loop counter is $i\n" if($1);
# $2 and $3 stay undefined: the pattern has only one capture group
print "\$2 is $2, pos(\$test3) is $position, loop counter is $i\n" if($2);
print "\$3 is $3, pos(\$test3) is $position, loop counter is $i\n" if($3);
# you can reset the position here if you need to: pos($test3) = 0;
$i++;
# guard against an infinite loop in case the position is reset inside the loop
if($i > 10){ last;}
}
print "\n";
The output looks like this:
$ perl simple.pl
String: "This is a simple thing, just a simple simple thing."
first test scalar context:
$1 is simple, pos($test1) is 16
second test list context:
Match 0 is simple
Match 1 is simple
Match 2 is simple
third test scalar context but with looping:
$1 is simple, pos($test3) is 16, loop counter is 0
$1 is simple, pos($test3) is 37, loop counter is 1
$1 is simple, pos($test3) is 44, loop counter is 2
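The list-context test prints no position, so here is a quick way to check what pos() holds after a list-context /g match. Treat it as a sketch: list_match_report is a hypothetical helper, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: run a list-context /g match and return the number of
# matches plus a flag saying whether pos() is defined afterwards.
sub list_match_report {
    my ($string) = @_;
    my @matches = $string =~ /(simple)/g;   # list context: collects every match
    return (scalar @matches, defined(pos($string)) ? 1 : 0);
}

my ($count, $pos_defined) =
    list_match_report("This is a simple thing, just a simple simple thing.");
print "found $count matches; pos defined afterwards: $pos_defined\n";
# prints: found 3 matches; pos defined afterwards: 0
```

pos() comes back undefined because the list-context match keeps going until an attempt fails, and a failed match resets the position.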