in reply to regex assistance for parsing arg list

Having a function call in the argument list is not your only problem. An argument can be an arbitrary expression, and many expressions contain commas of their own. The simplest is probably a plain string: funca("this is just one argument, I think"). But there are many other common argument expressions: anonymous arrays and hashes, method calls, and map/grep, which don't use the "regular" function syntax and take a block that may, and often does, contain commas in one form or another. That brings me to the => operator, which is itself a kind of comma. And so on. It's up to you to decide how detailed you want to be. You'll quite likely end up with a little parser, far beyond simple one-line regexes. For that I recommend Parse::RecDescent, or at least looking into it to see if it fits your needs. Parse::RecDescent provides means to extract code blocks and several quote constructs, including qr//, qq!!, etc.
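To make the comma problem concrete, here is a small sketch of what happens when every comma is treated as a separator. The argument string is made up for illustration; it has three real arguments.

```perl
use strict;
use warnings;

# Hypothetical argument list with three real arguments:
# a string, an anonymous array, and a nested function call.
my $args = q{"a, b", [1, 2], foo(3, 4)};

# Naive approach: treat every comma as an argument separator.
my @pieces = split /\s*,\s*/, $args;

# Reports six pieces instead of three, because the commas inside
# the string, the array and the nested call are split on as well.
printf "%d pieces: %s\n", scalar @pieces, join ' | ', @pieces;
```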

If you decide to stay with regexes or a hand-rolled parser, look into recursive regexes; see the perlre manpage.
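A minimal sketch of a recursive regex, assuming Perl 5.10 or later, where named recursion with (?&name) is available (older perls need the (??{ ... }) construct described in perlre). The group names and the sample input are only for illustration, and the pattern knows nothing about strings, so a parenthesis inside a quoted string will confuse it.

```perl
use 5.010;
use strict;
use warnings;

# One balanced (...) group; nested groups are matched by
# recursing into the named group itself.
my $balanced = qr/
    (?<group>
        \(
        (?: [^()]++      # a run of non-parenthesis characters
          | (?&group)    # or a nested balanced group
        )*
        \)
    )
/x;

my $code = 'x = func(1,(2,3),4) + 5;';

my ($name, $arglist);
if ($code =~ /(?<name>\w+)\s*$balanced/) {
    ($name, $arglist) = ($+{name}, $+{group});
    print "$name called with $arglist\n";
}
```

Named recursion is used here rather than (?1) so that the pattern keeps working when it is interpolated into a larger regex that has capture groups of its own.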

Another approach would be to first find something that looks like a function call, by scanning for function names and then grabbing everything inside the parentheses. You'd do this by "balanced matching", i.e. you match all of "(1,(2,3),4)". Again, look in perlre for an example of this. But you need to handle strings specifically, since they might contain any kind of bracket. After that you can try to parse the individual arguments. If that fails, handle those cases specifically. Then you'll at least be aware that there are "strange" function calls.
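Once you have grabbed the text between the outer parentheses, splitting it into arguments could look roughly like the sketch below. split_args is a hypothetical helper, not something from a module. It assumes only plain single- and double-quoted strings with backslash escapes (no q//, qq!!, heredocs or regex literals), and it counts brackets without checking that the kinds match.

```perl
use strict;
use warnings;

# Split an argument string on top-level commas only.
# Quoted strings are copied verbatim; (), [] and {} raise the nesting depth.
sub split_args {
    my ($text) = @_;
    my (@args, $current);
    my $depth = 0;
    $current = '';

    # Tokens: a whole quoted string, or else a single character.
    while ($text =~ /\G( "(?:[^"\\]|\\.)*" | '(?:[^'\\]|\\.)*' | . )/gcxs) {
        my $tok = $1;
        if (length $tok > 1) {           # a complete quoted string
            $current .= $tok;
            next;
        }
        if    ($tok =~ /[(\[{]/) { $depth++ }
        elsif ($tok =~ /[)\]}]/) { $depth-- }

        if ($tok eq ',' && $depth == 0) { # a real argument separator
            push @args, $current;
            $current = '';
            next;
        }
        $current .= $tok;
    }

    # A "strange" call: report it instead of guessing.
    die "unbalanced brackets in: $text\n" if $depth != 0;

    push @args, $current if $current =~ /\S/;
    s/^\s+|\s+$//g for @args;            # trim surrounding whitespace
    return @args;
}

print "$_\n" for split_args('1, "a,b", (2,3), [4, {k => 5}]');
```

The die on unbalanced input is one way of "handling those specifically": wrap the call in eval and log whatever fails, so you find out which calls need special treatment.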

This method is far from perfect, but perhaps it's good enough for you.

Hope I've helped,
ihb

Re: Re: regex assistance for parsing arg list
by denap (Initiate) on Feb 04, 2003 at 19:25 UTC
    yes, you've helped, thanks for that. I'll look at the references and see where I get. fortunately for me I *know* that the most complex embedded func will be of the form func(1,2,func(A,B),3). No embedded strings with commas, brackets etc.