in reply to Splitting a comma-delimited string where a substring could contain commas

I need to split a string by commas, but excluding any commas between parentheses

Here's a start, which doesn't use lookahead assertions. It works on your test case, but I would throw more test cases at it before putting it into production.

    local $_ = "this, that, those, these (not enough, nope, never), there";
    while ( /(?:^|, )([^,]+\(.*?\)|[^,]+)/g ) {
        print $1, "\n";
    }

You have to understand a bit about backtracking to get how this works. At each position it tries to match, in this order:
  1. at the beginning of a string, a word followed by a parenthetical
  2. at the beginning of a string, a word
  3. following ", ", a word followed by a parenthetical
  4. following ", ", a word
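For anyone who wants to poke at that ordering, here is a minimal sketch that wraps the same regex in a sub and collects the fields instead of printing them. The name split_outside_parens is made up for illustration, and the regex is unchanged, so it still assumes ", " as the separator and at most one level of parens.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): collects each
# field matched by the regex above into a list.
sub split_outside_parens {
    my ($str) = @_;
    my @fields;
    # Alternation order matters: "word + parenthetical" is tried
    # before the plain "word" branch at every starting position.
    while ( $str =~ /(?:^|, )([^,]+\(.*?\)|[^,]+)/g ) {
        push @fields, $1;
    }
    return @fields;
}

my @fields = split_outside_parens(
    "this, that, those, these (not enough, nope, never), there"
);
print "$_\n" for @fields;
```

On the test string this yields five fields, with "these (not enough, nope, never)" kept whole because the first branch wins before the plain-word branch gets a chance.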

Re: Re: Splitting a comma-delimited string where a substring could contain commas
by samtregar (Abbot) on May 03, 2002 at 19:09 UTC
    That looks pretty good, but it doesn't deal with multiple levels of parens. I think Text::Balanced is really the better solution.

    -sam
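To make the nesting concern concrete, here is a rough sketch of one way to split only on top-level commas. Note that this is a hand-rolled depth counter, not the Text::Balanced API. The sub name split_top_level is hypothetical, and it assumes "(" and ")" are the only grouping characters and that surrounding whitespace can be trimmed.

```perl
use strict;
use warnings;

# Hypothetical helper: split on commas only when the paren depth is 0.
# This is a plain depth counter, swapped in for illustration; it does
# not use Text::Balanced.
sub split_top_level {
    my ($str) = @_;
    my @fields;
    my $current = '';
    my $depth   = 0;
    for my $ch ( split //, $str ) {
        if ( $ch eq '(' ) {
            $depth++;
        }
        elsif ( $ch eq ')' ) {
            $depth--;
            die "unbalanced ')' in input\n" if $depth < 0;
        }
        if ( $ch eq ',' && $depth == 0 ) {
            push @fields, $current;    # a top-level comma ends the field
            $current = '';
        }
        else {
            $current .= $ch;
        }
    }
    die "unbalanced '(' in input\n" if $depth != 0;
    push @fields, $current;
    s/^\s+|\s+$//g for @fields;        # trim padding around each field
    return @fields;
}

my @nested = split_top_level(
    "this, (that, those), these ((not enough, (nope)), never), there"
);
print "$_\n" for @nested;
```

Under those assumptions the nested test string comes back as four fields, and unbalanced input dies rather than splitting silently.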

      ... but it doesn't deal with multiple levels of parens.

      Coding now to deal with nested parens would be solving a problem that hasn't been presented. There might or might not be nested parens in the data. I'd wait for the "customer" to clarify their requirements before hitting this with a larger hammer. YMMV.

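      local $_ = "this, (that, those), these ((not enough, (nope)), never), there";
      # Turn the string into a regex: each "(" becomes a literal "(" that opens
      # a capture group, each ")" closes the group and then matches a literal ")".
      (my $re=$_)=~s/((\()|(\))|.)/${[')','']}[!$3]\Q$1\E${['(','']}[!$2]/gs;
      # Matching the string against itself captures the contents of every balanced pair.
      $re= join'|',map{quotemeta}eval{/$re/};
      # Unbalanced parens make the generated regex fail to compile.
      die $@ if $@ =~ /unmatched/;
      # A field is any run of non-commas or whole captured parentheticals.
      while( /((?:$re|[^,])*)/g ){ print "$1\n"; }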