Regex: continue from previous match

John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

If I use `while ($line =~ /$re/pgc) { ... }`, subsequent iterations will return different ways of matching the $re, including different ways starting at the same location, or moving forward a character at a time from the start.

I want to start searching where the previous match ended. But I don't want to match only at that position, like the \G anchor would have. Rather, I want to tell it to start trying to match there and continue forward a char at a time, as if I had replaced $line with ${^POSTMATCH}, only more efficiently than copying the string or deleting stuff from the beginning of the string.

The pos($line) doesn't do that.

Re: Regex: continue from previous match by ikegami (Patriarch) on Apr 23, 2011 at 07:55 UTC
Then don't use `\G`: `while ('abxxxacxxxad' =~ /a(.)/g) { say $1; }`
Re^2: Regex: continue from previous match by John M. Dlugosz (Monsignor) on Apr 23, 2011 at 08:02 UTC
That's what I have (no \G) and it doesn't work. Your example of an 'a' followed by any one char will not match variable length or different ways at the same position, so it's moot. Put two a's in a row in your test string and you'll see overlapping matches: `perl -E'while ("abaaxxxacxxad" =~ /a(.)/g) { say $& }'`
Re^3: Regex: continue from previous match by ikegami (Patriarch) on Apr 23, 2011 at 08:07 UTC
"Your example of an 'a' followed by any one char will not match variable length"
Sure it works with variable-length patterns: `while ('abxxxacdxxxae' =~ /a([^x]+)/g) { say $1; }`
"Your example of an 'a' followed by any one char will not match [...] different ways at the same position"
Because you didn't ask for that. You asked about matching no earlier than `pos`. The only way to make the regex engine backtrack is to cause the match to fail: `'abcd' =~ /(.*) (?{ say $1 }) (?!)/sx`
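To make the backtracking point concrete, here is a minimal sketch, assuming a perl of 5.10 or later. The sub name `ways_at_start` and the package variable `@ways` are made up for illustration. The pattern is anchored with `^` so that only the alternatives at the first position are listed.

```perl
use strict;
use warnings;
use feature 'say';

# Package variable (not a lexical) so the regex code block sees it
# reliably on older perls as well.
our @ways;

# Hypothetical helper: list every way (.*) can match at the start of $str.
sub ways_at_start {
    my ($str) = @_;
    @ways = ();
    # (?{ ... }) runs each time the engine reaches it; (?!) always fails,
    # which forces the engine to backtrack into (.*) and try a shorter match.
    $str =~ /^(.*)(?{ push @ways, $1 })(?!)/s;
    return @ways;
}

say join '|', ways_at_start('abcd');   # prints "abcd|abc|ab|a|"
```

The overall match never succeeds, so `pos` is not advanced. The alternatives are only observed as a side effect of the code block.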
Re: Regex: continue from previous match by moritz (Cardinal) on Apr 23, 2011 at 07:58 UTC
"But I don't want to match only at that position, like the \G anchor would have."
Then don't use \G. Without \G the construct does exactly what you want: `use 5.010; $_ = 'aaababaab'; while (/(a+)/pcg) { say pos($_), " ", $1; }` This prints `3 aaa`, `5 a` and `8 aa`.
Perl 6 - second systems done right
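The difference between scanning forward from `pos` and anchoring at `pos` can be seen side by side. This is a small sketch on the same test string. The helper name `scan_all` is made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: run a /gc loop and record "end_position:capture"
# for every successful match.
sub scan_all {
    my ($str, $re) = @_;
    my @hits;
    while ($str =~ /$re/gc) {
        push @hits, pos($str) . ":$1";
    }
    return @hits;
}

my $text = 'aaababaab';

# Without \G each iteration starts at pos and scans forward.
say join ' ', scan_all($text, qr/(a+)/);     # prints "3:aaa 5:a 8:aa"

# With \G the match must begin exactly at pos, so the loop stops at the first 'b'.
say join ' ', scan_all($text, qr/\G(a+)/);   # prints "3:aaa"
```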
Re^2: Regex: continue from previous match by John M. Dlugosz (Monsignor) on Apr 23, 2011 at 08:13 UTC
Hmm, why is it behaving differently?! Ahh, I think my problem concerns my use of PREMATCH, not the matching itself. ${^PREMATCH} is giving me the whole beginning of the string up to the next match, not the part between the last search result and the next: `use 5.10.1; use utf8; my $simples = qr{\*\*|//}; my $ps = qr/ (?<simple>$simples)(?<body>.*?)\k<simple> /x; my @results; my $line = q[You can make things **bold** or //italic// or //both// or //both//.]; say "Original: [$line]"; while ($line =~ /$ps/pgc) { say "pos is " . pos($line); say "PREMATCH: (${^PREMATCH})" unless length(${^PREMATCH}) == 0; say "SIMPLE: ($+{simple}) ($+{body})"; }`
I guess to find the stuff between matches as well as the matches, I need to put it explicitly within a capture too. A lazy match of anything first. Thanks.
Re^3: Regex: continue from previous match by moritz (Cardinal) on Apr 23, 2011 at 08:39 UTC
"I guess to find the stuff between matches as well as the matches, I need to put it explicitly within a capture too."
Either do `\G(.*?)$your_regex_here` and have the stuff between matches in $1, or you could just store the previous value of pos (the end of the previous match) and then get the stuff between matches with substr, up to the start of the current match (`$-[0]`).
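Both suggestions can be sketched roughly as follows. The sub names `split_with_G` and `split_with_substr` are made up for illustration. The sketch assumes the pattern never matches the empty string, and it uses `$-[0]` and `$+[0]` (the start and end offsets of the last match) for the substr variant.

```perl
use strict;
use warnings;
use feature 'say';

# Variant 1: anchor at pos with \G and capture the gap lazily.
# $1 is the text between matches, $2 is the match itself.
sub split_with_G {
    my ($str, $re) = @_;
    my @parts;
    while ($str =~ /\G(.*?)($re)/gcs) {
        push @parts, [ $1, $2 ];
    }
    return @parts;
}

# Variant 2: remember where the previous match ended and cut the gap
# out with substr, without copying or modifying the original string.
sub split_with_substr {
    my ($str, $re) = @_;
    my @parts;
    my $prev_end = 0;
    while ($str =~ /$re/gc) {
        my $between = substr($str, $prev_end, $-[0] - $prev_end);
        my $match   = substr($str, $-[0], $+[0] - $-[0]);
        push @parts, [ $between, $match ];
        $prev_end = pos($str);   # same offset as $+[0]
    }
    return @parts;
}

# Render each pair as [between](match).
my $show = sub { join ' ', map { "[$_->[0]]($_->[1])" } @_ };

say $show->(split_with_G('abxxxacxxxad', qr/a./));        # prints "[](ab) [xxx](ac) [xxx](ad)"
say $show->(split_with_substr('abxxxacxxxad', qr/a./));   # prints "[](ab) [xxx](ac) [xxx](ad)"
```

The `\G` variant is shorter. The substr variant leaves the original pattern untouched, which may matter if the pattern has its own numbered captures.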