PerlMonks  

Regex Greed

by temporal (Pilgrim)
on Aug 07, 2012 at 20:48 UTC

temporal has asked for the wisdom of the Perl Monks concerning the following question:

I want to find all regex matches within a string. The problem is that matches can overlap, with part of one match also belonging to another, and I think this is throwing things off.

Example:

$test = "xTx\nxxTxxT";
$rx = 'x...T';
@matches = $test =~ /$rx/sg;
printf "match #%i:\n%s\n", ++$i, $_ for @matches;

The code above finds only 1 match when there are actually 2 matches within the string. Perl grabs the first match "x\nxxT" and then I think it starts looking for a new match where that last one ended. What I'd like to do is also get the other match - "xTxxT".

Is there any way to get Perl to check the entire string for each greedy regex pass excluding any previously found patterns?
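The behaviour described above can be checked directly. The following is a minimal sketch, not part of the original post: it runs the same pattern in a `while` loop and records where each match starts and where the next search resumes. The array name `@found` is made up for illustration; `$-[0]` and `pos()` are standard Perl.

```perl
use strict;
use warnings;

my $test = "xTx\nxxTxxT";

# Collect [start offset, resume offset, text] for each plain /g match.
my @found;
while ($test =~ /(x...T)/sg) {
    # $-[0] is where the match began; pos() is where the next search resumes.
    push @found, [ $-[0], pos($test), $1 ];
}

printf "start=%d resume=%d\n", @$_[0, 1] for @found;
# prints: start=2 resume=7
```

The second candidate, "xTxxT", starts at offset 5, which lies inside the text already consumed (offsets 2 through 6), so a plain /g scan never tries it.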

Re: Regex Greed
by jwkrahn (Abbot) on Aug 07, 2012 at 21:12 UTC
    $ perl -e'
    my $test = "xTx\nxxTxxT";
    my $rx = qr/(?=(x...T))/s;
    my @matches = $test =~ /$rx/g;
    print "match #", $_ + 1, ":\n$matches[$_]\n" for 0 .. $#matches;
    '
    match #1:
    x
    xxT
    match #2:
    xTxxT
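      To add a little detail, jwkrahn is using a lookahead, so the actual match itself is zero-width and the capture group inside the lookahead holds the text. Because the match consumes no characters, the next attempt begins just one character further along, which is how overlapping matches get found. See "Looking ahead and looking behind" in perlretut.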
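      As a rough illustration of the zero-width point (a sketch, not jwkrahn's code), the loop below uses the same lookahead-with-capture pattern and records the offset of each capture. The names `@hits` and `$shown` are made up for this example; `$-[1]` is the standard start offset of capture group 1.

```perl
use strict;
use warnings;

my $test = "xTx\nxxTxxT";

# The lookahead consumes nothing, so every match is zero-width and the
# engine steps forward before trying again; overlaps are not skipped.
my @hits;
while ($test =~ /(?=(x...T))/sg) {
    # $-[1] is the start offset of capture group 1.
    push @hits, { start => $-[1], text => $1 };
}

for my $hit (@hits) {
    (my $shown = $hit->{text}) =~ s/\n/\\n/g;   # make the newline visible
    print "offset $hit->{start}: $shown\n";
}
# prints:
# offset 2: x\nxxT
# offset 5: xTxxT
```

      The two captures overlap at offsets 5 and 6, which a pattern that consumes its match could not report in a single /g scan.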

      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

        Unfortunately, the referenced section does not discuss the zero-width-lookahead-to-a-capture trick of jwkrahn's solution. Does anyone know where this is covered in the standard docs (as opposed to a PerlMonks node)?

Re: Regex Greed
by temporal (Pilgrim) on Aug 07, 2012 at 21:45 UTC

    Thanks guys, exactly what I was looking for. Always wondered when I'd have to come back and read the regex docs more closely.

    Strange things are afoot at the Circle-K.

      Keep re-reading them.   You will always learn something new, and you will never be disappointed or feel that you have wasted your time.

        Keep re-reading them.

        Yea and amen to that, brother! And very often the 'new' thing you learn will be something you forgot five minutes after the last time you read it.
