PerlMonks  

Matching parens on a regex is beating me

by bfdi533 (Friar)
on Apr 22, 2008 at 18:13 UTC ( [id://682241]=perlquestion )

bfdi533 has asked for the wisdom of the Perl Monks concerning the following question:

I feel pretty good about my regex-fu, but I am running into a difficult one that I could use some help with.

I have a string that needs to be reformatted and it contains functions that can call functions. e.g.

func2(func1(input1, input2), input3)

But it does not matter which order they are called in. So, it could also be

func1(input1, func2(input2, input3))

So, my dilemma is that if I do a regex to find func1 and replace it then I am left with func2 to deal with. The order matters as to the matching of parens. I have not yet figured out how to get a matching () regex to work properly and have read all of the posts here already.

Here is my code to date:

while (<>) {
    $line = $_;
    $line =~ s/func1\((.*?)\)/proc1{$1}/g;
    $line =~ s/func2\((.*?)\)/FUNC $1 END/g;
    $line =~ s/{/(/g;
    $line =~ s/}/)/g;
    print $line;
}

The problem is that when I run it I get this:

proc1(input1, FUNC input2, input3) END
FUNC input1, proc1(input2, input3) END

rather than

proc1(input1, FUNC input2, input3 END)
FUNC input1, proc1(input2, input3) END

Just in case it is not easy to tell the difference I am getting ") END" rather than "END)".

Any help or pointers on this?
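
The stray ") END" in the output above comes from the non-greedy (.*?), which stops at the first closing paren it meets rather than at the one that balances the opening paren. A minimal demonstration on one of the sample strings (the helper name first_capture is made up for illustration):

```perl
use strict;
use warnings;

# Return what the original func1 pattern captures in $1 (undef if no match).
sub first_capture {
    my ($text) = @_;
    return $text =~ /func1\((.*?)\)/ ? $1 : undef;
}

# The lazy quantifier ends at the first ")", so the capture is cut short
# and the final ")" is left outside the match.
print first_capture('func1(input1, func2(input2, input3))'), "\n";
# prints: input1, func2(input2, input3
```

The leftover ")" is what later ends up in front of "END".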

Replies are listed 'Best First'.
Re: Matching parens on a regex is beating me
by Roy Johnson (Monsignor) on Apr 22, 2008 at 18:33 UTC
    perldoc -q matching:

    How do I find matching/nesting anything?

    This isn't something that can be done in one regular expression, no matter how complicated. To find something between two single characters, a pattern like /x([^x]*)x/ will get the intervening bits in $1. For multiple ones, then something more like /alpha(.*?)omega/ would be needed. But none of these deals with nested patterns. For balanced expressions using (, {, [ or < as delimiters, use the CPAN module Regexp::Common, or see "(??{ code })" in perlre. For other cases, you'll have to write a parser.

    That said, for your situation, you can repeatedly find the innermost expression and replace it using a dispatch table:
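    my %arg_xlate = (
        func1 => sub { "proc1{$_[0]}" },
        func2 => sub { "FUNC $_[0] END" },
    );
    while (<DATA>) {
        my $line = $_;
        # Rewrite the innermost call (no parens in its arguments) until none remain.
        0 while ($line =~ s/(func[12])\(([^()]*)\)/$arg_xlate{$1}($2)/ge);
        # Braces kept rewritten calls out of the way of [^()]*; restore parens now.
        $line =~ tr/{}/()/;
        print $line;
    }
    __DATA__
    func1(input1, func2(input2, input3))
    func2(func1(input1, input2), input3)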

    Caution: Contents may have been coded under pressure.
Re: Matching parens on a regex is beating me
by jettero (Monsignor) on Apr 22, 2008 at 18:51 UTC
    Someone relatively recently pointed me to Text::Balanced. That's a non-regex (er... partially regex based?) way to do it.

    -Paul
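
A minimal sketch of the Text::Balanced route, assuming its extract_bracketed function is used to pull out the balanced "(...)" group that follows a function name. The helper balanced_args is hypothetical, not something from the thread or the module:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: return the balanced "(...)" group that follows the
# first occurrence of "$name(" in $text, or undef if there is none.
sub balanced_args {
    my ($text, $name) = @_;
    my $pos = index($text, "$name(");
    return undef if $pos < 0;
    my $rest = substr($text, $pos + length $name);
    # extract_bracketed expects the bracket at the start of the string
    # (optionally after whitespace) and honours nesting.
    my ($group) = extract_bracketed($rest, '()');
    return $group;
}

print balanced_args('func1(input1, func2(input2, input3))', 'func1'), "\n";
# prints: (input1, func2(input2, input3))
```

Passing '()' as the delimiter set tells extract_bracketed to treat only parentheses as brackets.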

Re: Matching parens on a regex is beating me
by johngg (Canon) on Apr 22, 2008 at 18:42 UTC
    You may be able to use this code in your solution. It finds nested brackets in strings but the match fails if the brackets aren't balanced. The bracket groups are shown in depth first, left to right order.

    use strict;
    use warnings;
    use re q{eval};

    my @codeStrings = (
        q{func2(func1(input1, input2), input3)},
        q{func1(input1, func2(input2, input3))},
        q{Contains i(mbalan(ced Br(ack)ets, )one op)en m)missing},
        q{So(me m(ultip)le n(est(s in) thi)s o)ne},
    );

    my @memoList;
    my $rxNest;
    $rxNest = qr{(?x)
        (
            \(
            [^()]*
            (?:
                (??{$rxNest})
                [^()]*
            )*
            \)
        )
        (?{ [ @{$^R}, $^N ] })
    };

    my $rxOnlyNested;
    {
        $rxOnlyNested = qr{(?x)
            (?{ [] })
            ^
            [^()]*
            (?: $rxNest [^()]* )+
            \z
            (?{ @memoList = @{$^R} })
        };
    }

    testString($_) for @codeStrings;

    sub testString {
        my $string = shift;
        @memoList = ();
        print qq{\nString: $string\n};
        if($string =~ /$rxOnlyNested/) {
            print qq{ Match succeeded\n};
            print qq{ ---------------\n};
            print qq{ Before brackets:-\n};
            print qq{ -->@{[substr $string, 0, $-[1]]}<--\n};
            print qq{ Bracket pairs:-\n};
            print qq{ $_\n} for @memoList;
            print qq{ After brackets:-\n};
            print qq{ -->@{[substr $string, $+[1]]}<--\n};
        }
        else {
            print qq{ Match failed\n};
        }
    }

    The output.

    String: func2(func1(input1, input2), input3)
     Match succeeded
     ---------------
     Before brackets:-
     -->func2<--
     Bracket pairs:-
     (input1, input2)
     (func1(input1, input2), input3)
     After brackets:-
     --><--

    String: func1(input1, func2(input2, input3))
     Match succeeded
     ---------------
     Before brackets:-
     -->func1<--
     Bracket pairs:-
     (input2, input3)
     (input1, func2(input2, input3))
     After brackets:-
     --><--

    String: Contains i(mbalan(ced Br(ack)ets, )one op)en m)missing
     Match failed

    String: So(me m(ultip)le n(est(s in) thi)s o)ne
     Match succeeded
     ---------------
     Before brackets:-
     -->So<--
     Bracket pairs:-
     (ultip)
     (s in)
     (est(s in) thi)
     (me m(ultip)le n(est(s in) thi)s o)
     After brackets:-
     -->ne<--

    I hope this helps you.

    Cheers,

    JohnGG

    Update: I've found the post this originally came from; diotalevi, ikegami and hv gave me lots of help in finding this solution.

Re: Matching parens on a regex is beating me
by terjek (Sexton) on Apr 23, 2008 at 14:52 UTC
    I found an example of a regexp that does this in perldoc perlre and implemented it. It should work.

    #!perl -l
    use strict;
    my @a = ("func2(func1(input1, input2), input3)",
             "func1(input1, func2(input2, input3))");
    my $re = "";
    # Balanced parens: $1 is the content, (??{ $re }) re-enters the pattern for nested groups.
    $re = qr{\((((?>[^()]+)|(??{ $re }))*)\)};
    for my $line (@a) {
        $line =~ s/func1$re/proc1{$1}/g;
        $line =~ s/func2$re/FUNC $1 END/g;
        $line =~ tr/{}/()/;
        print $line;
    }
    Terje
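
The (??{ $re }) construct in the reply above re-enters the pattern through a variable at run time. Perl 5.10 and later can also recurse into a capture group directly, which removes the self-referencing variable. A sketch under that assumption; the names $paren, args and translate are chosen for illustration and are not from the thread:

```perl
use strict;
use warnings;
use 5.010;    # named captures and (?&name) recursion need Perl 5.10+

# Balanced parentheses. "args" captures the text between the outer parens;
# (?&paren) recurses into the named group for every nested "(...)".
my $paren = qr{
    (?<paren>
        \(
        (?<args> (?: [^()]++ | (?&paren) )* )
        \)
    )
}x;

# Hypothetical helper applying the same two rewrites as the reply above.
sub translate {
    my ($line) = @_;
    $line =~ s/func1$paren/'proc1{' . $+{args} . '}'/ge;
    $line =~ s/func2$paren/'FUNC ' . $+{args} . ' END'/ge;
    $line =~ tr/{}/()/;    # braces were placeholders for the new parens
    return $line;
}

say translate('func1(input1, func2(input2, input3))');
# prints: proc1(input1, FUNC input2, input3 END)
say translate('func2(func1(input1, input2), input3)');
# prints: FUNC proc1(input1, input2), input3 END
```

Like the original, this makes one pass per function name, so a func1 nested directly inside another func1 would need the substitution repeated.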
Re: Matching parens on a regex is beating me
by bfdi533 (Friar) on Apr 23, 2008 at 03:55 UTC

    I wanted to say thanks for the input and suggestions. I was hoping for the continued use of a 1-line regex as, to be honest, this is a proof of concept that I will have to translate to C# and .NET.

    I am still digesting the replies and will see what can be done.
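
If the regex features used in the replies turn out to be hard to carry over to another language, a small hand-written scanner is one alternative. The sketch below uses only a depth counter, substrings and recursion. The names %rewrite, matching_paren and translate are hypothetical, and the rewrite rules are copied from the substitutions in the question:

```perl
use strict;
use warnings;

# Hypothetical rewrite rules, taken from the substitutions in the question.
my %rewrite = (
    func1 => sub { "proc1($_[0])" },
    func2 => sub { "FUNC $_[0] END" },
);

# Index of the ")" that balances the "(" at $open, or -1 if unbalanced.
sub matching_paren {
    my ($text, $open) = @_;
    my $depth = 0;
    for my $i ($open .. length($text) - 1) {
        my $ch = substr($text, $i, 1);
        $depth++ if $ch eq '(';
        $depth-- if $ch eq ')';
        return $i if $depth == 0;
    }
    return -1;
}

# Rewrite every known call in $text, translating its arguments first.
sub translate {
    my ($text) = @_;
    my ($out, $pos) = ('', 0);
    while ($text =~ /\b(\w+)\(/g) {
        my $name = $1;
        next unless $rewrite{$name};          # unknown name: keep scanning
        my $open  = pos($text) - 1;           # index of the "("
        my $close = matching_paren($text, $open);
        last if $close < 0;                   # unbalanced: leave the rest alone
        my $start = $open - length $name;
        $out .= substr($text, $pos, $start - $pos);
        my $args = translate(substr($text, $open + 1, $close - $open - 1));
        $out .= $rewrite{$name}->($args);
        $pos = $close + 1;
        pos($text) = $pos;                    # resume after the rewritten call
    }
    return $out . substr($text, $pos);
}

print translate('func1(input1, func2(input2, input3))'), "\n";
# prints: proc1(input1, FUNC input2, input3 END)
print translate('func2(func1(input1, input2), input3)'), "\n";
# prints: FUNC proc1(input1, input2), input3 END
```

Because each call is rewritten exactly once and never rescanned, the output can use real parentheses directly instead of brace placeholders.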
