John M. Dlugosz has asked for the wisdom of the Perl Monks concerning the following question:

If I use while ($line =~ /$re/pgc) { ... subsequent iterations will return different ways of matching the $re, including different ways starting at the same location, or moving forward a character at a time from the start.

I want to start searching where the previous match ended. But I don't want to match only at that position, like the \G anchor would have. Rather, I want to tell it to start trying to match there and continue forward a char at a time, as if I replaced $line with ${^POSTMATCH} only more efficiently than copying the string or deleting stuff from the beginning of the string.

The pos($line) doesn't do that.

Re: Regex: continue from previous match
by ikegami (Patriarch) on Apr 23, 2011 at 07:55 UTC

    Then don't use \G

    while ('abxxxacxxxad' =~ /a(.)/g) { say $1; }
      That's what I have (no \G) and it doesn't work. Your example of an 'a' followed by any one char will not match variable length or different ways at the same position, so it's moot. Put two a's in a row in your test string and you'll see overlapping matches.

       perl -E'while ("abaaxxxacxxad" =~ /a(.)/g) { say $& }'

        Your example of an 'a' followed by any one char will not match variable length

        Sure it works with variable length patterns.

        while ('abxxxacdxxxae' =~ /a([^x]+)/g) { say $1; }

        Your example of an 'a' followed by any one char will not match [...] different ways at the same position

        Because you didn't ask that. You asked about matching no earlier than pos.
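
        To illustrate "matching no earlier than pos", here is a minimal sketch. Assigning to pos() sets where the next /g match starts scanning, and since there is no \G the match itself may begin anywhere from that offset onward. The helper name matches_from and the sample string are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return every match of $re in $str, scanning
# forward from character offset $start.
sub matches_from {
    my ($str, $re, $start) = @_;
    pos($str) = $start;          # the next /g match starts scanning here
    my @found;
    while ($str =~ /$re/g) {     # no \G, so a match may begin after pos
        push @found, $&;         # $& is the text of the whole match
    }
    return @found;
}

# Scan 'abxxxacxxxad' from offset 2, skipping the leading 'ab'.
print join(",", matches_from('abxxxacxxxad', qr/a(.)/, 2)), "\n";
```

        Each iteration resumes where the previous match ended, so the matches returned never overlap.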

        The only way to make the regex engine backtrack is to cause the match to fail.

        'abcd' =~ /(.*) (?{ say $1 }) (?!)/sx
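
        The same forced-failure trick can collect the alternatives instead of printing them. This is only a sketch: the helper name all_a_runs, the fixed pattern /(a+)/, and the package variable @ways are assumptions made for the example. A package variable is used because code blocks that close over lexicals were unreliable in older perls.

```perl
use strict;
use warnings;

our @ways;   # package variable, filled in by the code block below

# Hypothetical helper: record every way /(a+)/ can match inside $str.
sub all_a_runs {
    my ($str) = @_;
    @ways = ();
    # (?{ ... }) runs each time the engine gets past (a+); (?!) then
    # always fails, so the engine backtracks and tries the next way.
    $str =~ /(a+) (?{ push @ways, $1 }) (?!)/x;
    return @ways;
}

print join(" ", all_a_runs('aaa')), "\n";
```

        The overall match always fails, so the only useful result is the list gathered along the way. That list includes overlapping matches and several matches starting at the same position.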
Re: Regex: continue from previous match
by moritz (Cardinal) on Apr 23, 2011 at 07:58 UTC
    But I don't want to match only at that position, like the \G anchor would have.

    Then don't use \G. Without \G the construct does exactly what you want:

    use 5.010;
    $_ = 'aaababaab';
    while (/(a+)/pcg) {
        say pos($_), " ", $1;
    }
    __END__
    3 aaa
    5 a
    8 aa
      Hmm, why is it behaving differently?! Ahh, I think my problem concerns my use of PREMATCH, not the matching itself.

      PREMATCH is giving me the whole beginning of the string to the next match, not the part between the last search result and the next.

      use 5.10.1;
      use utf8;

      my $simples = qr{\*\*|//};
      my $ps = qr/ (?<simple>$simples)(?<body>.*?)\k<simple> /x;
      my @results;
      my $line = q[You can make things **bold** or //italic// or **//both//** or //**both**//.];
      say "Original: [$line]";
      while ($line =~ /$ps/pgc) {
          say "pos is " . pos($line);
          say "PREMATCH: (${^PREMATCH})" unless length(${^PREMATCH}) == 0;
          say "SIMPLE: ($+{simple}) ($+{body})";
      }
      I guess to find the stuff between matches as well as the matches, I need to put it explicitly within a capture too. A lazy match of anything first.

      Thanks.

        I guess to find the stuff between matches as well as the matches, I need to put it explicitly within a capture too.

        Either do \G(.*?)$your_regex_here and have the stuff between matches in $1, or you could just store the previous value of pos (which is where the last match ended), and then get the stuff between matches with substr, taking everything up to the start of the current match ($-[0]).
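
        Both suggestions can be sketched side by side. The helper names gaps_and_matches and gaps_and_matches_G are invented for this example, and the sketch assumes the pattern passed in uses named rather than numbered backreferences, because the \G variant wraps it in extra capture groups and so shifts the group numbers.

```perl
use strict;
use warnings;

# Hypothetical helper, substr variant: return [gap, match] pairs, where
# gap is the text between the end of the previous match and the start
# of the current one.
sub gaps_and_matches {
    my ($line, $re) = @_;
    my @pairs;
    my $prev_end = 0;
    while ($line =~ /$re/g) {
        # $-[0] is where this match starts; $+[0] (equal to pos) is where it ends
        push @pairs, [ substr($line, $prev_end, $-[0] - $prev_end), $& ];
        $prev_end = $+[0];
    }
    return @pairs;
}

# Hypothetical helper, \G(.*?) variant: the lazy group captures the gap
# in $1 and the wrapped pattern captures the match in $2.
sub gaps_and_matches_G {
    my ($line, $re) = @_;
    my @pairs;
    while ($line =~ /\G(.*?)($re)/gs) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

my $re = qr{(?<simple>\*\*|//)(?<body>.*?)\k<simple>};
for my $pair (gaps_and_matches('make it **bold** or //italic// now', $re)) {
    print "gap: [$pair->[0]] match: [$pair->[1]]\n";
}
```

        Neither variant copies or truncates the string. Text after the last match is not returned by either helper; it can be recovered with substr from the final end offset.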