ascetic has asked for the wisdom of the Perl Monks concerning the following question:

I have searched CPAN and here and haven't found a good answer to this basic question. I have a string of comma-separated values, but some values contain commas nested within parens.
$str = 'sue,fred,x(mary,jane)';
I want to break this string into three elements...
['sue','fred','x(mary,jane)']

where 'mary' and 'jane' are two args of function x, which I am going to resolve by processing the third item later on.

Not four elements...
['sue','fred','x(mary','jane)']
I have tried several ideas like splitting and then recombining adjacent elements that should not have been split, but was hoping for a cleaner solution.

Replies are listed 'Best First'.
Re: csv with parens that include commas
by Anonymous Monk on Dec 05, 2010 at 00:48 UTC
    FWIW, that is bogus csv

    I would preprocess with

    my $csv = Text::CSV->new({ qw! allow_loose_escapes 1 allow_loose_quotes 1 ! });
    ...
    s/x\(([^\(\)]+)\)/"$1"/g;
    $csv->parse($_);
    Seems to work
    $ perl -MText::CSV -e" $t = Text::CSV->new({qw! auto_diag 1 allow_loose_escapes 1 allow_loose_quotes 1 ! }); $_ = q!sue,fred,x(mary,jane)!; s/x\(([^\(\)]+)\)/\x22$1\x22/g; warn $_; warn $t->parse( $_ ); warn$_ for $t->fields"
    sue,fred,"mary,jane" at -e line 1.
    1 at -e line 1.
    sue at -e line 1.
    fred at -e line 1.
    mary,jane at -e line 1.

      I want the third item to be the string

      x(mary,jane)

      i.e. mary and jane are two args of function x which I am going to resolve by processing the third item later on

      P.S. Although this may not meet the technical definition of "csv" it was useful in the title.

        Yes, and? You want to treat something that is not CSV as CSV. That's a no-go situation.
Re: csv'ish string with parens that include commas
by BrowserUk (Patriarch) on Dec 05, 2010 at 05:02 UTC

    for( 'sue,fred,x(mary,jane)', 'x(a,b),c(d,e,f),g,h,i(j,k)', 'a,b,cdef(g),h,i(j)' ) {
        print join' | ', m/( [^,(]+ \( [^)]+ \) | [^,]+ )(?:$|,)/gx;
    };;

    sue | fred | x(mary,jane)
    x(a,b) | c(d,e,f) | g | h | i(j,k)
    a | b | cdef(g) | h | i(j)

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
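The pattern in the reply above appears to assume a single level of parentheses. If the function calls could themselves be nested, e.g. f(a,g(b,c)), one alternative is to walk the string and track paren depth instead of using a regex. This is only a minimal sketch, not something proposed in the thread; the helper name split_top_level is hypothetical, and it assumes the parens are balanced.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split on commas that sit
# at paren depth 0. Assumes balanced parentheses in the input.
sub split_top_level {
    my ($str) = @_;
    my @fields = ('');
    my $depth  = 0;
    for my $ch (split //, $str) {
        if ($ch eq ',' && $depth == 0) {
            push @fields, '';          # top-level comma: start a new field
            next;
        }
        $depth++ if $ch eq '(';
        $depth-- if $ch eq ')' && $depth > 0;
        $fields[-1] .= $ch;            # everything else belongs to the current field
    }
    return @fields;
}

print join(' | ', split_top_level('sue,fred,x(mary,jane)')), "\n";
print join(' | ', split_top_level('f(a,g(b,c)),d')), "\n";
```

The first call prints sue | fred | x(mary,jane) and the second prints f(a,g(b,c)) | d.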
      m/\G( [^,()]* \( [^()]* \) | [^,()]+ )(?:,|$)/gx;

        Is that intended as an improvement? What purpose does the \G serve?
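On the \G question: in a list-context m//g, \G forces each match to start exactly where the previous one ended, so the extraction stops at the first spot the pattern cannot handle instead of skipping ahead. The sketch below runs the \G pattern from above with and without the anchor; the malformed sample input is made up for illustration.

```perl
use strict;
use warnings;

my $input = 'a,(b,c';   # malformed on purpose: the paren never closes

# Same pattern twice; only the leading \G differs.
my @anchored   = $input =~ m/\G( [^,()]* \( [^()]* \) | [^,()]+ )(?:,|$)/gx;
my @unanchored = $input =~ m/  ( [^,()]* \( [^()]* \) | [^,()]+ )(?:,|$)/gx;

print "with \\G:    @anchored\n";
print "without \\G: @unanchored\n";
```

With \G only a comes back, which makes the bad input noticeable. Without it the match silently skips the stray paren and returns a, b and c. On well-formed input such as sue,fred,x(mary,jane) both versions return the same three fields.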

Re: csv with parens that include commas
by Anonymous Monk on Dec 05, 2010 at 00:38 UTC
Re: csv'ish string with parens that include commas
by james2vegas (Chaplain) on Dec 05, 2010 at 03:57 UTC
    Use Regexp::Common::list, something like this (presuming your strings are all word characters):
    use Regexp::Common 'list';
    my $re = qr{
        \w+                                # word characters (word or function name)
        (?:\($RE{list}{-pat=>'\w+'}\))?    # (optional parameters)
    }x;
    my $x = 'joe,sam,x(monkey,lemur)';
    my @x = ($x =~ m{$re}g);
    print join("\n", @x);
    returns:
    joe
    sam
    x(monkey,lemur)

      Thanks. Actually, some of the args will be numbers, but I should be able to take it from here.

      I also looked at using Text::Balanced for a bit, but it seemed like using a hammer to drive a screw.

        \w matches letters, digits and underscore, so that should be sufficient.
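A quick way to check which kinds of arguments \w+ covers is sketched below. The sample values are made up; the point is that plain integers pass, while a decimal point or a leading sign is not a \w character, so such numbers would need a wider pattern.

```perl
use strict;
use warnings;

# Sample arguments are made up for illustration.
my @args = ('jane', '42', 'max_2', '3.14', '-7');

my %is_word;
for my $arg (@args) {
    # \A...\z requires the whole string to consist of \w characters.
    $is_word{$arg} = $arg =~ /\A\w+\z/ ? 1 : 0;
    printf "%-6s %s\n", $arg, $is_word{$arg} ? 'matches' : 'does not match';
}
```

Here jane, 42 and max_2 match; 3.14 and -7 do not.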