in reply to Regexp matching words, not doing what I expect

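With the /g modifier, each match starts where the previous one ended, so matches can never overlap: once a word has been consumed as the second half of one pair, it cannot also start the next pair.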

A workaround is to only match the first word normally, and match the second one in a look-ahead:

my $text = "hello to all the perl monks";
while ($text =~ /\b([A-Za-z'\-]+) (?=([A-Za-z'\-]+))\b/g) {
    print "$1 $2\n";
}

(Gives the desired output).

The key is that look-ahead groups (?=...) match but don't consume any characters, so the position where the next match starts is not affected by what that group matched. The capture inside the look-ahead still sets $2. See perlre for details, or "Mastering Regular Expressions" by J. Friedl.
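To make the difference visible, here is a small sketch that runs the same pattern with and without the look-ahead on the sample string. The sub names word_pairs and word_pairs_consuming are made up for this illustration and are not from the original question.

```perl
use strict;
use warnings;

# Hypothetical helper: the second word sits inside a look-ahead, so it is
# captured in $2 but not consumed, and the next match can start on it.
sub word_pairs {
    my ($text) = @_;
    my @pairs;
    while ($text =~ /\b([A-Za-z'\-]+) (?=([A-Za-z'\-]+))\b/g) {
        push @pairs, "$1 $2";
    }
    return @pairs;
}

# Hypothetical helper: same pattern without the look-ahead. Both words are
# consumed, so the next match can only start after the second word.
sub word_pairs_consuming {
    my ($text) = @_;
    my @pairs;
    while ($text =~ /\b([A-Za-z'\-]+) ([A-Za-z'\-]+)\b/g) {
        push @pairs, "$1 $2";
    }
    return @pairs;
}

my $text = "hello to all the perl monks";
print "with look-ahead:\n";
print "  $_\n" for word_pairs($text);
print "without look-ahead:\n";
print "  $_\n" for word_pairs_consuming($text);
```

With the look-ahead this should list all five adjacent pairs ("hello to", "to all", "all the", "the perl", "perl monks"); without it, only the three non-overlapping ones ("hello to", "all the", "perl monks").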

Re^2: Regexp matching words, not doing what I expect
by cosmicperl (Chaplain) on May 14, 2009 at 09:46 UTC
    Thanks Moritz, look-aheads are new to me. This knowledge is going to prove very useful, thanks :)