SonicWang has asked for the wisdom of the Perl Monks concerning the following question:

I want to extract a particular section between {}, like:

void fo {
    ........
    { //first one
        f = fopen(...);
        .......
        if(...) {
        }
    } //mapping one
    ........
}

and I want to get the {} block that the fopen belongs to:

{ //first one
    f = fopen(...);
    .......
    if(...) {
    }
} //mapping one

How can I get it?

Using m/({[^{]*fopen.*)/ I can match from the opening { through fopen, but how do I get the matching }? Thanks a lot.

Replies are listed 'Best First'.
Re: a question about re
by Eliya (Vicar) on Apr 28, 2011 at 15:12 UTC

    In the general case, it's hard to do (just think of curlies within comments or string literals, for instance...).  But in simple, clearly circumscribed contexts, you might be able to get away with Text::Balanced, for example (instead of a 'real' parser).
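
    A minimal sketch of the Text::Balanced route, under the same "simple context" assumption (no curlies in comments or string literals). The helper name enclosing_block and the sample input are made up for illustration. It tries each '{' before the word, nearest first, and keeps the first balanced block that actually reaches past the word:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: return the innermost {...} block that contains the
# first occurrence of $word in $src, or nothing if there is no such block.
# Assumption: no braces hidden in comments or string literals.
sub enclosing_block {
    my ($src, $word) = @_;
    return unless $src =~ /\b\Q$word\E\b/;
    my $word_pos = $-[0];

    # Try each '{' that precedes the word, nearest first.
    my $start = $word_pos;
    while (($start = rindex($src, '{', $start - 1)) >= 0) {
        my $tail = substr($src, $start);
        my ($block) = extract_bracketed($tail, '{}');

        # Accept the block only if it extends past the word.
        return $block
            if defined $block && $start + length($block) > $word_pos;
        last if $start == 0;
    }
    return;
}

my $code = <<'END';
void fo {
    ........
    { //first one
        f = fopen(...);
        .......
        if(...) {
        }
    } //mapping one
    ........
}
END

my $found = enclosing_block($code, 'fopen');
print defined $found ? "$found\n" : "no enclosing block\n";
```

    With only '{}' given as delimiters, extract_bracketed treats parentheses and other bracket types as ordinary characters, which is what you want for a sample like this one.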

Re: a question about re
by moritz (Cardinal) on Apr 28, 2011 at 15:04 UTC
    This is not easily done with a regex, because you'd have to count the number of opening and closing brackets. You can write recursive regexes that do such things, but that's generally not very much fun.

    How about using an existing parser for C instead?
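
    For comparison, the bracket counting mentioned above can also be done by hand, without a regex doing the bookkeeping. This is only a sketch: block_around is a hypothetical helper name, and it assumes there are no braces inside comments or string literals, which a real C parser would handle properly.

```perl
use strict;
use warnings;

# Hypothetical helper: find the innermost {...} block around the first
# occurrence of $word by counting braces.
# Assumption: no braces inside comments or string literals.
sub block_around {
    my ($src, $word) = @_;
    return unless $src =~ /\b\Q$word\E\b/;
    my $pos = $-[0];

    # Walk backwards to the unmatched '{' that opens the enclosing block.
    my ($depth, $open) = (0, undef);
    for (my $i = $pos - 1; $i >= 0; $i--) {
        my $c = substr($src, $i, 1);
        if ($c eq '}') {
            $depth++;                      # a block that closed before the word
        }
        elsif ($c eq '{') {
            if ($depth == 0) { $open = $i; last }
            $depth--;                      # opener of an already closed block
        }
    }
    return unless defined $open;

    # Walk forwards from that '{' to the '}' that closes it.
    $depth = 0;
    for (my $i = $open; $i < length $src; $i++) {
        my $c = substr($src, $i, 1);
        if ($c eq '{') {
            $depth++;
        }
        elsif ($c eq '}') {
            return substr($src, $open, $i - $open + 1) if --$depth == 0;
        }
    }
    return;    # unbalanced input
}

my $code = <<'END';
void fo {
    ........
    { //first one
        f = fopen(...);
        .......
        if(...) {
        }
    } //mapping one
    ........
}
END

my $found = block_around($code, 'fopen');
print defined $found ? "$found\n" : "no enclosing block\n";
```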

Re: a question about re
by wind (Priest) on Apr 28, 2011 at 17:50 UTC

    The following utilizes the (?PARNO) feature introduced in perl 5.10 to create a recursive regex that captures balanced braces.

    It could fail if there are braces included within strings in strange ways, but it should work fine for the way most people code:

    use strict;
    use warnings;

    my $data = do {local $/; <DATA>};

    if ($data =~ /(\s*{ [^{}]* \bfopen\b ((?: (?>[^{}]+) | {(?-1)} )*) })/x) {
        print $1;
    }

    __DATA__
    void fo {
        ........
        { //first one
            f = fopen(...);
            .......
            if(...) {
            }
        } //mapping one
        ........
    }
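
    To make the caveat about braces inside strings concrete, here is a stripped-down sketch using the same (?-1) recursion, reduced to a plain balanced-brace matcher. The helper first_block and both sample strings are invented for the illustration:

```perl
use strict;
use warnings;

# Balanced-brace pattern: (?-1) recurses into the capture group it sits in.
my $balanced = qr/(\{ (?: (?>[^{}]+) | (?-1) )* \})/x;

# Hypothetical helper: return the first balanced {...} span, or undef.
sub first_block {
    my ($src) = @_;
    return $src =~ $balanced ? $1 : undef;
}

my $ok     = 'x { a { b } c } y';
my $tricky = 'x { s = "{"; } y';    # lone brace inside a string literal

print first_block($ok), "\n";       # prints: { a { b } c }
print first_block($tricky), "\n";   # prints: {"; }
```

    In the second case the regex has no idea that the inner brace is part of a string literal, so the outer block never balances and the match starts inside the string instead. That is the kind of input where a real parser is needed.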
Re: a question about re
by wind (Priest) on Apr 29, 2011 at 03:55 UTC

    My first solution did not take into account cases where there are statements before the fopen but on the same block level. This was mostly because it would make the regex even more complicated, and I figured my example would lead you in the right direction if you wanted to go the regex route.

    However, on closer inspection, when I tried to solve the problem myself, I ran into efficiency problems, with the regex never completing. It took quite a bit of finagling to get it to work, so I'm also sharing this solution:

    use strict;
    use warnings;

    my $data = do {local $/; <DATA>};

    our $braces_re = qr/(\{ (?: (?>[^{}]+) | (?-1) )* \})/x;

    while ($data =~ /(\{
            (?: (?>(?:(?!\bfopen\b)[^{}])+) | (?> $braces_re ) )*
            \bfopen\b
            (?: (?>[^{}]+) | (?> $braces_re ) )*
        \})/xg) {
        print "'$1'\n";
    }

    __DATA__
    void fo {
        ........
        { //first one
            f = fopen(...);
            .......
            if(...) {
            }
        } //mapping one
        ........
    }
    void bar {
        ........
        { //first one
            if(...) {
                for {
                    if {
                    } else {
                    }
                }
                f = fopen(...);
                while {
                }
            }
            .......
            if(...) {
            }
        } //mapping one
        ........
    }