in reply to Regex keep matching the last possible match (but should get all)

Each greedy match, working left to right (an important point when you have several as in this case), swallows up as many characters as it can while still allowing the match to succeed. So:

    $s = 'aaaaaaaaa';
    $s =~ /.+(a+)/;      # greedy match
    print "$1\n";        # prints "a" (a single 'a')

    $s = 'aaaaaaaaa';
    $s =~ /.+?(a+)/;     # non-greedy match
    print "$1\n";        # prints "aaaaaaaa" (all but the first 'a')

So when your pattern is matching too much, start from the end and work your way backwards. Look at what each wildcard pattern (like .+) can match, and think about how to restrict it so it won't match too much. Is there a character that can't appear in it? If so, exclude that character. For instance, if it can't contain any HTML tags, you could use a wildcard like [^<>]+? instead. That says, "as few characters as possible, not including angle brackets, while allowing the pattern to match."
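
Here is a rough sketch of the difference, using a made-up HTML snippet (the string and variable names are only for illustration, not from the original question):

```perl
use strict;
use warnings;

# Hypothetical input, just for illustration.
my $html = '<b>one</b> and <b>two</b>';

# Greedy .+ runs all the way to the last </b> it can reach.
my ($greedy) = $html =~ m{<b>(.+)</b>};

# Excluding angle brackets keeps the capture inside one pair of tags.
my ($restricted) = $html =~ m{<b>([^<>]+?)</b>};

# With /g in list context, the restricted pattern returns every capture.
my @all = $html =~ m{<b>([^<>]+?)</b>}g;

print "$greedy\n";      # prints "one</b> and <b>two"
print "$restricted\n";  # prints "one"
print "@all\n";         # prints "one two"
```

Because the character class can never cross a tag boundary, the restricted version stays put even if you later drop the ? and let it be greedy.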

Another suggestion: whenever your regex contains forward slashes, use a different regex delimiter so you don't have to backslash the slashes. Even better, learn to use whitespace in your regexes with the /x modifier. See how much clearer the second and third versions here are than the first:

    $text =~ /$mm\/$dd\/$yy/;    # works, but ugly

    $text =~ m|$mm/$dd/$yy|;     # use different delimiter

    $text =~ m|  $mm     # month
               / $dd     # day
               / $yy     # year
              |x;        # allow whitespace

The third example might seem like overkill for such a simple regex, but the longer and more complicated they get, the more these methods help.
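
To give a feel for where that pays off, here is a sketch of a somewhat longer pattern written with /x and a brace delimiter. The log line format and the variable names are assumptions made up for the example:

```perl
use strict;
use warnings;

# Hypothetical log line, just for illustration.
my $line = '03/14/24 09:26:53 backup finished';

my ($mm, $dd, $yy, $time, $msg) = $line =~ m{
    ^ (\d\d) / (\d\d) / (\d\d)   # month / day / two-digit year
    \s+
    (\d\d : \d\d : \d\d)         # time as HH:MM:SS
    \s+
    (.+)                         # rest of the line is the message
}x;

print "month=$mm day=$dd year=$yy time=$time msg=$msg\n";
```

Note that under /x a literal space in the pattern has to be written as \s, \ (backslash-space), or [ ], since bare whitespace is ignored.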

Aaron B.
Available for small or large Perl jobs and *nix system administration; see my home node.