in reply to Re: A better way to split CSV files with quoted strings that may contain commas?

Wow! What a timely tidbit of information. I was just pondering the same problem and your link was most helpful. But as I discovered from the FAQ link, the example doesn't handle extra spacing (around commas) very well.
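To make the spacing problem concrete, here is a minimal sketch. `naive_split` is a hypothetical helper: its pattern is written in the style of the FAQ example (a quoted alternative, an unquoted alternative, and a bare comma) but it is not quoted from the FAQ, so treat it as an assumption about how a pattern with no whitespace handling behaves.

```perl
use strict;
use warnings;

# Hypothetical helper: a comma splitter with no whitespace handling,
# written in the style of the FAQ pattern (an assumption, not a quote).
sub naive_split {
    my $text = shift;
    my @fields;
    push @fields, $+ while $text =~ m{
        "([^\"\\]*(?:\\.[^\"\\]*)*)",?   # double-quoted field
      | ([^,]+),?                        # unquoted field
      | ,                                # empty field
    }gx;
    push @fields, undef if substr($text, -1, 1) eq ',';
    return @fields;
}

# Same record, once tightly packed and once with a space after each comma
my @tight = naive_split(q/SAR001,"Cimetrix, Inc",N/);
my @loose = naive_split(q/SAR001, "Cimetrix, Inc", N/);

print scalar(@tight), " fields without spaces\n";
print scalar(@loose), " fields with spaces\n";
```

With the space in front of it, the opening quote is no longer the first character the pattern sees, so the quoted alternative never gets a chance and the unquoted alternative cuts the field at its inner comma. The tight record yields three fields and the spaced one yields four.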

Here's my 'space-fixed' version, which also handles single-quoted fields:
use strict;
use warnings;

# crazy mix of quoting
my $test1 = q/, "test with space ",,, mary had , a,, 'cake, with cheese' , and, "a, \"little" , lamb chop , /;

# original FAQ example
my $test2 = q/SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped"/;

# throw some wild spaces in there
my $test3 = q/SAR001, "", "Cimetrix, Inc", "Bob Smith", "CAM",N, 8 ,1, 0 ,7, "Error, Core Dumped"/;

# finally a nearly empty string
my $test4 = q/,/;

split_string($test1);
split_string($test2);
split_string($test3);
split_string($test4);

sub split_string {
    my $text = shift;
    my @new  = ();
    push(@new, $+) while $text =~ m{
        \s*(
            # groups the phrase inside double quotes
            "([^\"\\]*(?:\\.[^\"\\]*)*)"\s*,?
            # groups the phrase inside single quotes
          | '([^\'\\]*(?:\\.[^\'\\]*)*)'\s*,?
            # trims leading/trailing space from phrase
          | ([^,\s]+(?:\s+[^,\s]+)*)\s*,?
            # just to grab empty phrases
          | (),
        )\s*}gx;
    push(@new, undef) if $text =~ m/,\s*$/;

    # just to prove it's working
    print "string: >>$text<<\n";
    foreach (@new) {
        print " part: >>" . (defined($_) ? $_ : '') . "<<\n";
    }
}