PerlMonks  

capturing matching parenthesis

by lebe0024 (Novice)
on Apr 15, 2004 at 20:17 UTC ( [id://345524]=perlquestion )

lebe0024 has asked for the wisdom of the Perl Monks concerning the following question:

I can't figure out a regular expression that will return a parenthesized block in a string, no matter how big or small it is and whether or not it contains other parenthesized blocks.

In other words, if I have a string like '( a ( b ( c ) ( d ) e ) )', I want a regular expression to capture the whole thing, not just '( a ( b ( c )' and not just '( c )'. Can you help, O wise ones?

Replies are listed 'Best First'.
Re: capturing matching parenthesis
by davido (Cardinal) on Apr 15, 2004 at 20:23 UTC
    That's the sort of thing that Text::Balanced is good for. While more recent releases of Perl give the RE engine enough ammunition to accomplish the task, the Text::Balanced module is already a robust implementation designed for your type of problem.

    Look, in particular, at the 'extract_bracketed' function.


    Dave
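For readers who want to try both routes mentioned above, here is a minimal sketch. It assumes a sample string of my own choosing, and for the regex route it assumes Perl 5.10 or later, where recursive subpatterns such as (?1) are available; the original reply does not say which regex feature it had in mind, so treat that half as one possible reading.

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Sample input (an assumption for illustration).
my $text = 'foo ( a ( b ( c ) ( d ) e ) ) bar';

# Route 1: Text::Balanced. extract_bracketed wants the opening bracket at
# the start of the text (after optional whitespace), so work on a copy of
# the string beginning at the first '('.
my $tail = substr($text, index($text, '('));
my ($block, $remainder) = extract_bracketed($tail, '()');
print "Text::Balanced grabbed: $block\n" if defined $block;

# Route 2: a recursive regex (needs Perl 5.10+). Group 1 is an open paren,
# then any mix of non-paren runs or nested copies of group 1, then a close
# paren.
my $re_block;
if ($text =~ /(\((?:[^()]++|(?1))*\))/) {
    $re_block = $1;
    print "regex grabbed: $re_block\n";
}
```

Both variables should end up holding the outermost block. Neither route pays attention to backslash-escaped parens as written.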

Re: capturing matching parenthesis
by diotalevi (Canon) on Apr 15, 2004 at 20:21 UTC
Re: capturing matching parenthesis
by Stevie-O (Friar) on Apr 16, 2004 at 05:03 UTC
    There's also Regexp::Common::balanced. For example:
    use Regexp::Common qw(balanced);
    if ($foo =~ / ( $RE{balanced} ) /x) {
        print "grabbed $1";
    }
    If you look at the documentation, you'll see that it can actually be used very flexibly -- it can handle proper nesting of mixed brackets (e.g. () and {}).

    Oh, and beware the lowercase 'b' -- took me a few confused tries to install it from CPAN till I noticed that caveat ;)

    --Stevie-O
    $"=$,,$_=q>|\p4<6 8p<M/_|<('=> .q>.<4-KI<l|2$<6%s!<qn#F<>;$, .=pack'N*',"@{[unpack'C*',$_] }"for split/</;$_=$,,y[A-Z a-z] {}cd;print lc
Re: capturing matching parenthesis
by muba (Priest) on Apr 15, 2004 at 21:05 UTC
    I'd say not to use a regexp. Just walk over the string byte by byte. Keep a $parenthesis counter: ++ it when you encounter a (, -- it when you encounter a ). Stop when $parenthesis has been greater than 0 and has dropped back to 0.

    Good luck.
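The counting approach described above might look something like this. It is only a sketch: the sub name first_paren_block and the variable names are made up for illustration, and it ignores escaped parens entirely.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first balanced (...) block in $str,
# or an empty list / undef if no block ever closes.
sub first_paren_block {
    my ($str) = @_;
    my ($depth, $start) = (0, undef);

    for my $i (0 .. length($str) - 1) {
        my $ch = substr($str, $i, 1);
        if ($ch eq '(') {
            $start = $i if $depth == 0;   # remember where the block opens
            $depth++;
        }
        elsif ($ch eq ')' && $depth > 0) {
            $depth--;
            # The counter has been above 0 and is back at 0: block complete.
            return substr($str, $start, $i - $start + 1) if $depth == 0;
        }
    }
    return;   # never got back to 0
}

my $found = first_paren_block('x ( a ( b ( c ) ( d ) e ) ) y');
print "$found\n" if defined $found;
```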
      Fine, unless you care about the possibility of escaped parens. For example, '( ( \( two ) )' would fail with your method unless you specifically were watching for that sort of thing with additional logic.


      Dave

        Hmm, I was indeed not taking care of backwhacks.

        But then, the method would not change that much. Also keep track of another variable, let's call it $escape. Set it to 1 if you find a \. Then, in the next iteration of the loop, if that character is a ( or ) and $escape != 0, ignore it. If it is a backslash, ignore it too. Then set $escape back to 0.
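A sketch of the same walk with the extra $escape flag, under the assumption that a backslash simply makes the next character count as ordinary text. The sub name is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: like a plain paren counter, but a character that
# follows a backslash is never counted as an opening or closing paren.
sub first_paren_block_escaped {
    my ($str) = @_;
    my ($depth, $start, $escape) = (0, undef, 0);

    for my $i (0 .. length($str) - 1) {
        my $ch = substr($str, $i, 1);

        if ($escape) {          # previous character was a backslash
            $escape = 0;        # ignore this one, whatever it is
            next;
        }
        if ($ch eq '\\') {
            $escape = 1;
        }
        elsif ($ch eq '(') {
            $start = $i if $depth == 0;
            $depth++;
        }
        elsif ($ch eq ')' && $depth > 0) {
            $depth--;
            return substr($str, $start, $i - $start + 1) if $depth == 0;
        }
    }
    return;
}

my $found = first_paren_block_escaped('( ( \( two ) )');
print "$found\n" if defined $found;
```

With the example string from the reply above, the escaped \( is skipped, so the two real opening parens pair up with the two closing ones.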
Re: capturing matching parenthesis
by perlinux (Deacon) on Apr 16, 2004 at 08:18 UTC
    No regex comes to mind :-( I think your string is a kind of tree with nodes, where the leaves are the innermost letters. It would not be impossible to use a C-like struct, isolating every level of nodes, with the leaves at the last level.

    ( a ( b ( c ) ( d ) e ) )

    Graphically:

         a
        / \
       b  e
      / \
     c   d

    Excuse my English
Re: capturing matching parenthesis
by melora (Scribe) on Apr 16, 2004 at 17:59 UTC
    The times I've had to cope with nested parentheses, I've used recursion: call the recursive routine when you encounter the open paren (excellent point about the "\(" sequence, by the way), and return when you encounter the close paren (of course, the escape sequence applies there, too), resuming in the original string at the point just beyond the close paren. I do like the idea of populating a tree structure with it, too, depending on the analysis you need to do on the whole expression. But then, I'm an old C programmer.
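One way the recursive idea could be sketched in Perl, building a nested array structure as it goes. The sub names parse and parse_group, the whitespace-separated tokenizing, and the choice of array refs for the tree are all assumptions made for illustration, not anything spelled out in the replies.

```perl
use strict;
use warnings;

# Hypothetical helper: called just after an opening paren has been consumed.
# Collects items until the matching close paren and returns them as an
# array ref; nested groups become nested array refs.
sub parse_group {
    my ($tokens) = @_;
    my @node;
    while (@$tokens) {
        my $tok = shift @$tokens;
        if ($tok eq '(') {
            push @node, parse_group($tokens);   # recurse on open paren
        }
        elsif ($tok eq ')') {
            return \@node;                      # return on close paren
        }
        else {
            push @node, $tok;                   # ordinary (or escaped) item
        }
    }
    die "unbalanced input: missing ')'\n";
}

# Hypothetical entry point: split the string into tokens, then parse the
# first group. A backslash plus the following character is one token, so
# an escaped paren is treated as ordinary text.
sub parse {
    my ($str) = @_;
    my @tokens = $str =~ /\\.|[()]|[^\s()\\]+/g;
    my $first = shift @tokens;
    die "expected '(' at start\n" unless defined $first && $first eq '(';
    return parse_group(\@tokens);
}

my $tree = parse('( a ( b ( c ) ( d ) e ) )');
```

Here $tree holds 'a' followed by one nested group; Data::Dumper is a convenient way to inspect the result.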
