temporal has asked for the wisdom of the Perl Monks concerning the following question:

I want to find all regex matches within a string. The problem is that sometimes parts of one match are parts of another and I think this is throwing things off.

Example:

    $test = "xTx\nxxTxxT";
    $rx = 'x...T';
    @matches = $test =~ /$rx/sg;
    printf "match #%i:\n%s\n", ++$i, $_ for @matches;

The code above finds only 1 match when there are actually 2 matches within the string. Perl grabs the first match "x\nxxT" and then I think it starts looking for a new match where that last one ended. What I'd like to do is also get the other match - "xTxxT".
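The resume-where-the-last-match-ended behavior described here can be checked by printing pos() after each match. This is only a minimal sketch: plain_matches is a hypothetical helper name, not something from the original post, and the pattern is hard-coded for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns [start, end, text] for each match that an
# ordinary /g loop finds. "end" is pos(), i.e. where the next search resumes.
sub plain_matches {
    my ($str) = @_;
    my @found;
    while ($str =~ /(x...T)/sg) {
        push @found, [ $-[0], pos($str), $1 ];
    }
    return @found;
}

my $test = "xTx\nxxTxxT";
for my $m (plain_matches($test)) {
    (my $shown = $m->[2]) =~ s/\n/\\n/g;    # make the newline visible
    print "start=$m->[0] end=$m->[1] text=$shown\n";
}
# prints: start=2 end=7 text=x\nxxT
```

The single match ends at offset 7, so the next attempt begins there and never sees the "xTxxT" that starts at offset 5.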

Is there any way to get Perl to search the entire string on each pass, so that overlapping matches are found too, while skipping matches it has already reported?

Re: Regex Greed
by jwkrahn (Abbot) on Aug 07, 2012 at 21:12 UTC
    $ perl -e'
    my $test = "xTx\nxxTxxT";
    my $rx = qr/(?=(x...T))/s;
    my @matches = $test =~ /$rx/g;
    print "match #", $_ + 1, ":\n$matches[$_]\n" for 0 .. $#matches;
    '
    match #1:
    x
    xxT
    match #2:
    xTxxT
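      To add a little detail, jwkrahn is using a lookahead, so the actual match itself is zero-width: the capture group inside the lookahead records the text, but no characters are consumed, so the next /g attempt can begin at the very next character and overlapping matches are found. See "Looking ahead and looking behind" in perlretut.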
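The zero-width behavior can be made visible by recording the offset of each match. What follows is a sketch under stated assumptions rather than jwkrahn's exact code: overlapping_matches is a hypothetical helper name, and it assumes the pattern is passed in as a qr// object so that it keeps its own /s flag.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the given pattern in a capturing lookahead and
# returns [offset, text] for every position where the pattern can match.
sub overlapping_matches {
    my ($str, $pattern) = @_;
    my @found;
    while ($str =~ /(?=($pattern))/g) {
        # The overall match is empty, so $-[0] is simply the current offset;
        # $1 holds the text captured inside the lookahead.
        push @found, [ $-[0], $1 ];
    }
    return @found;
}

my $test = "xTx\nxxTxxT";
for my $m (overlapping_matches($test, qr/x...T/s)) {
    (my $shown = $m->[1]) =~ s/\n/\\n/g;    # make the newline visible
    print "offset $m->[0]: $shown\n";
}
# prints:
# offset 2: x\nxxT
# offset 5: xTxxT
```

Note that this reports at most one match per starting offset, which is enough for the example in this thread.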

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        Unfortunately, the referenced section does not discuss the zero-width-lookahead-to-a-capture trick of jwkrahn's solution. Does anyone know where this is covered in the standard docs (as opposed to a PerlMonks node)?

Re: Regex Greed
by temporal (Pilgrim) on Aug 07, 2012 at 21:45 UTC

    Thanks guys, exactly what I was looking for. Always wondered when I'd have to come back and read the regex docs more closely.

    Strange things are afoot at the Circle-K.

      Keep re-reading them. You will always learn something new, and you will never be disappointed or feel that you have wasted your time.

        Keep re-reading them.

        Yea and amen to that, brother! And very often the 'new' thing you learn will be something you forgot five minutes after the last time you read it.