Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks,

I have a string: concat("this","is good"). I need to store "this" and "is good" in two variables. That is simple for a string like the one above, but sometimes I have a string like this: concat(int2string(counter, 1),result)

How can I put int2string(counter, 1) in $1 and result in $2?

Thanks in advance.

edited: Fri Apr 11 22:07:09 2003 by jeffa - title change (was: "how to match concat(int2string(cnt,1),result)")


(jeffa) Re: Need a regex to match C-style function arg list
by jeffa (Bishop) on Apr 11, 2003 at 21:54 UTC
    Should you wish to take the Parse::RecDescent leap, here is some code you can play around with. The first major limitation i see with this code is that it only grabs the first two args from a function. That means that if a function has more than two args, only its first two will be processed. There may be more limitations that i did not think of during my limited testing of this code, but hey ... it's a start. ;)
    use strict;
    use warnings;
    use Parse::RecDescent;
    use vars qw( %item @item $return @pair @store );

    my $data   = do {local $/;<DATA>};
    my $parser = Parse::RecDescent->new($data);

    my @code = (
       q/concat(int2string(counter, 1),result)/,
       q/concat("hello","world")/,
       q/concat(foo,bar)/,
    );

    $parser->startrule($_) for @code;

    for (0..$#code) {
       print "$code[$_]\n";
       print "\targ 1 => ", $store[$_][0], "\n";
       print "\targ 2 => ", $store[$_][1], "\n";
    }

    __DATA__

    startrule: function

    function: label open_p arg comma arg close_p

    arg: nested_function | quoted_literal | literal

    nested_function: label open_p arg comma arg close_p
    {
       push @main::pair, join('', @item[1..$#item]);
       pop @main::store;
       if (@main::pair == 2) {
          push @main::store,[@main::pair];
          @main::pair = ();
       }
       $return = join('', @item[1..$#item]);
       print STDERR "nested function: $return\n";
    }

    quoted_literal: quote /[^"]+/ quote
    {
       push @main::pair, $item[2];
       if (@main::pair == 2) {
          push @main::store,[@main::pair];
          @main::pair = ();
       }
       print STDERR "quoted literal: $item[2]\n";
       $return = $item[2];
    }

    literal: /\w+/i
    {
       push @main::pair, $item[1];
       if (@main::pair == 2) {
          push @main::store,[@main::pair];
          @main::pair = ();
       }
       print STDERR "literal: $item[1]\n";
       $return = $item[1];
    }

    label:   /[a-zA-Z]\w*/
    open_p:  /\(/
    close_p: /\)/
    comma:   /,/
    quote:   /"/

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re: Need a regex to match C-style function arg list
by dws (Chancellor) on Apr 11, 2003 at 20:41 UTC
    The difficulty you face is in deciding which of the commas in
    concat(int2string(counter,1),result)
    is the one that separates the arguments. In the trivial case of two strings, this seems easy, except that the strings themselves might contain commas. But when you introduce arbitrary expressions into the mix, you're past the point where you can deal with this problem safely using a regular expression, and into territory where you need a parser to correctly distinguish the two arguments.

    Can you say a bit more about the types of expressions that might occur as arguments to concat()? If there's a limited set, you can special-case them with one regular expression per combination.
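To make the comma problem concrete: one way to pick out the separating comma is to scan the argument list while tracking nesting depth, and to treat a comma as a separator only at depth zero and outside quotes. The following is a minimal sketch, not a full parser. The sub name split_args is made up for illustration, and the code assumes double-quoted strings without escapes and balanced parentheses, neither of which the original poster has confirmed for the test language.

```perl
use strict;
use warnings;

# Hypothetical helper: split one call such as concat(a,b) into its name
# and its top-level arguments. Assumes double-quoted strings without
# escape sequences and balanced parentheses.
sub split_args {
    my ($call) = @_;
    my ($name, $body) = $call =~ /^\s*(\w+)\s*\((.*)\)\s*$/s
        or return;                      # not of the form name(...)

    my @args;
    my $current  = '';
    my $depth    = 0;                   # parenthesis nesting level
    my $in_quote = 0;                   # true while inside "..."

    for my $ch (split //, $body) {
        if ($ch eq '"') {
            $in_quote = !$in_quote;
        }
        elsif (!$in_quote) {
            if    ($ch eq '(') { $depth++ }
            elsif ($ch eq ')') { $depth-- }
            elsif ($ch eq ',' && $depth == 0) {
                # A top-level comma ends the current argument.
                push @args, $current;
                $current = '';
                next;
            }
        }
        $current .= $ch;
    }
    push @args, $current;

    s/^\s+|\s+$//g for @args;           # trim surrounding whitespace
    return ($name, @args);
}

for my $src ('concat("this","is good")',
             'concat(int2string(counter, 1),result)') {
    my ($name, @args) = split_args($src);
    print "$name: [", join('] [', @args), "]\n";
}
# prints:
# concat: ["this"] ["is good"]
# concat: [int2string(counter, 1)] [result]
```

Because the commas inside int2string(...) are seen at depth one, they never split the outer argument list. Other quoting forms (such as the '10'B literals mentioned elsewhere in this thread) would need their own handling if they can contain commas or parentheses.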

Re: Need a regex to match C-style function arg list
by BrowserUk (Patriarch) on Apr 11, 2003 at 22:34 UTC

    Given the samples supplied, this does the trick. If your expressions can contain spaces, be split across lines or contain nested concats, then it will need further work.

    #! perl -slw
    use strict;

    my $re_term   = qr[[^,)]+?];
    my $re_func   = qr[\w+\($re_term,$re_term\)];
    my $re_concat = qr[concat\((?:($re_func),($re_term|$re_func))\)];

    while(<DATA>) {
        s[^$re_concat$][$1 & $2];
        print;
    }

    __DATA__
    concat(int2str(count,1),result)
    concat(bit2str(count,'10'B),result)
    concat(int2str(count,1),bit2str(count,'10'B))

    Gives

    c:\test>249976
    int2str(count,1) & result
    bit2str(count,'10'B) & result
    int2str(count,1) & bit2str(count,'10'B)
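For the cases this regex leaves open (spaces and nested concats), one possible extension is a recursive pattern, which needs Perl 5.10 or later for named captures and (?&name) recursion. This is only a sketch under assumptions: the names $concat_re and rewrite_concat are invented here, strings are assumed to be double-quoted without escapes, and the output uses the parenthesised form ("this" & "is good") that the original poster describes elsewhere in this thread. It would also rewrite the text concat(...) if it appeared inside a quoted string.

```perl
use strict;
use warnings;
use 5.010;    # named captures and (?&name) recursion

# Hypothetical pattern: a two-argument concat(...) whose arguments may
# contain balanced parentheses and double-quoted strings.
my $concat_re = qr{
    \b concat \s* \(
        \s* (?<left>  (?&arg) ) \s* ,
        \s* (?<right> (?&arg) ) \s*
    \)
    (?(DEFINE)
        (?<arg>  (?: (?&atom) )++ )
        (?<atom>
              [^(),"]++                      # run of ordinary characters
            | " [^"]* "                      # double-quoted string, no escapes
            | \( (?: (?&atom) | , )*+ \)     # balanced group, commas allowed inside
        )
    )
}x;

# Rewrite concat(a,b) as (a & b). Each pass rewrites the outermost
# calls, so the loop repeats until nested calls are converted too.
sub rewrite_concat {
    my ($text) = @_;
    1 while $text =~ s{$concat_re}{
        my ($l, $r) = ($+{left}, $+{right});
        s/\s+\z// for $l, $r;                # drop trailing whitespace
        "($l & $r)"
    }ge;
    return $text;
}

print rewrite_concat($_), "\n" for (
    q/concat(int2str(count,1),result)/,
    q/concat("this", "is good")/,
    q/concat(concat(a,b),c)/,
);
# prints:
# (int2str(count,1) & result)
# ("this" & "is good")
# ((a & b) & c)
```

The possessive quantifiers keep the pattern from backtracking heavily on unbalanced input; a call that does not have exactly two top-level arguments is simply left unchanged.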

    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Need a regex to match C-style function arg list
by Anonymous Monk on Apr 11, 2003 at 21:14 UTC
    Thanks for the replies, monks. To explain the scenario a bit more: I'm writing an application that converts programs written in an earlier version of a test language to the new version of the language. One of the changes in the language is that the function 'concat' has been replaced by '&', so concat("this", "is good") becomes ("this" & "is good"). I'm facing exactly the problem you mentioned: it's hard to know the exact pattern. For example, concat can take the following forms:

        concat(int2str(count,1),result)
        concat(bit2str(count,'10'B),result)
        concat(int2str(count,1),bit2str(count,'10'B))

    Do you think there is a way to solve my problem?

      Sounds like a great job for Parse::RecDescent, except that you'll still need to give us access to the spec for "an earlier version of the test language". We specifically would need to know string-quoting-thingies that can hide a comma (like Perl's "" or '' or q()) and the nesting-thingies that can hide a comma recursively (like your concat function).

      Until you provide that spec, any response here is pure speculation.

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

Re: Need a regex to match C-style function arg list
by Mr. Muskrat (Canon) on Apr 11, 2003 at 20:22 UTC

      Or maybe this is JavaScript. It'd help to know what language you're expecting us to read.