ThreeMonks has asked for the wisdom of the Perl Monks concerning the following question:

I wrote the following code, trying to match balanced parentheses with a given ending. It produces a strange result of '(1,2)', while I am expecting '(4,5)'. Would someone tell me what is going wrong? I am using ActivePerl 5.10.0 build 1004.
    use strict;

    my $a    = '(0,((1,2),3,(4,5),6),7,8)';
    my $tail = qr{5\)};

    my $matching_parens = qr{
        (                       # paren group 1 (parens)
            \(
            (?:
                (?> [^()]+ )    # Non-parens without backtracking
              |
                (?1)            # Recurse to start of paren group 1
            )*
            \)
        )
        .*
        (?<= $tail )
    }x;

    $a =~ /$matching_parens/;
    print "$1\n";

Re: Strange regex result
by moritz (Cardinal) on Jul 08, 2009 at 15:48 UTC
    The .* (?<= $tail ) part (assuming that the \) is an error, you already have one in $tail) means "grab as many characters as possible, and then backtrack until you find the string 5) before the current position".

    So paren group 1 matches (1,2), the .* matches ,3,(4,5), and the regex engine is happy to have found a match and buys demerphq and dave_the_m a beer - or so I hope.
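
    To see this split for yourself, here is a minimal sketch that reuses the regex from the question and wraps the .* in a second capture group. The extra group and the variable names $str, $group1 and $consumed are additions for illustration only; they are not in the original code.

```perl
use strict;
use warnings;

my $str  = '(0,((1,2),3,(4,5),6),7,8)';
my $tail = qr{5\)};

my $re = qr{
    (                       # group 1: one balanced parenthesized chunk
        \(
        (?:
            (?> [^()]+ )    # non-parens, no backtracking
          |
            (?1)            # recurse into group 1
        )*
        \)
    )
    ( .* )                  # group 2, added for illustration only
    (?<= $tail )            # must be preceded by the tail at this point
}x;

# In list context the match returns the captured groups.
my ($group1, $consumed) = $str =~ $re;

print "group 1: $group1\n";     # group 1: (1,2)
print ".* took: $consumed\n";   # .* took: ,3,(4,5)
```

    The attempts starting at the two earlier opening parens fail, because nothing after those larger groups ends in 5), so the first successful start position is the one at (1,2).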

    If you want to match (4,5), omit the .*, so that the lookbehind is checked directly after the closing paren of group 1.
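
    A sketch of that change, assuming nothing else in the question's code is altered apart from renaming the variables ($str, $balanced_with_tail and $group are illustrative names):

```perl
use strict;
use warnings;

my $str  = '(0,((1,2),3,(4,5),6),7,8)';
my $tail = qr{5\)};

my $balanced_with_tail = qr{
    (                       # group 1: one balanced parenthesized chunk
        \(
        (?:
            (?> [^()]+ )    # non-parens, no backtracking
          |
            (?1)            # recurse into group 1
        )*
        \)
    )
    (?<= $tail )            # group 1 itself must end with the tail
}x;

my ($group) = $str =~ $balanced_with_tail;
print "$group\n";           # prints (4,5)
```

    With no .* in between, the lookbehind can only succeed when the balanced group itself ends in 5), so the engine keeps advancing the start position until it reaches (4,5).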