in reply to split $data, $unquoted_value;

I have the same problem when I want to split a CSV file into its columns. My conclusion was that I couldn't do it with a regexp, so I did it the way it would be done without regexps...

Basically, I read the line character by character and keep track of the contexts I enter (quoted string, escape)... For instance, this line:

"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""

should be split into

This is a column
Yes, I know
12323
23123.23
This is a "column"

OK, enough talking... the code speaks for itself...

#!/usr/bin/perl -w
use strict;

my $origin = '"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""';
my @cols = parse_line($origin);
print join("\n", @cols)."\n";

sub parse_line {
    my $line = shift;
    my @contexts;        # stack of the contexts we are nested in
    my $context = "";    # current context: "", "string" or "escape"
    my $column;
    my @cols;
    my $string_delim = '"';
    my $escape_char  = "\\";
    my $field_delim  = ',';
    for (my $i = 0; $i < length $line; $i++) {
        my $c = substr($line, $i, 1);
        if ($c eq $string_delim) {
            if ($context eq "string") {
                # closing quote: back to the enclosing context
                $context = pop @contexts;
            } elsif ($context eq "escape") {
                # escaped quote: keep it literally
                $column .= $c;
                $context = pop @contexts;
            } else {
                push @contexts, $context;
                $context = "string";
            }
        } elsif ($c eq $escape_char) {
            if ($context eq "escape") {
                $column .= $c;
                $context = pop @contexts;
            } else {
                push @contexts, $context;
                $context = "escape";
            }
        } elsif ($c eq $field_delim) {
            if ($context eq "string") {
                $column .= $c;
            } elsif ($context eq "escape") {
                $column .= $c;
                $context = pop @contexts;
            } else {
                # unquoted delimiter: the column is complete
                push @cols, $column;
                undef $column;
            }
        } else {
            $column .= $c;
            # an escaped ordinary character also ends the escape context
            $context = pop @contexts if $context eq "escape";
        }
        if ($i == length($line) - 1) {
            push @cols, $column;
            undef $column;
        }
    }
    return @cols;
}
daniel

Re^2: split $data, $unquoted_value;
by motobói (Beadle) on Sep 15, 2005 at 13:33 UTC
    Ruoso, wouldn't Text::CSV and related modules (XS, PP) work for you? I suppose you know this module, so what am I missing?
    #!/usr/bin/perl -w
    use Text::CSV_XS;

    my $csv = Text::CSV_XS->new( { 'escape_char' => '\\' } );
    my $line = '"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""';
    if ( $csv->parse($line) ) {
        print join( "\n", $csv->fields ), "\n";
    }
    else {
        print "Could not parse line\n", $csv->error_input, "\n";
    }
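
    If the goal is a whole file rather than a single line, the same module can read record by record with getline. What follows is only a minimal sketch under stated assumptions: the sample rows, the in-memory filehandle (used so the script runs without an input file) and the binary option are additions for illustration and are not part of the original problem.

    ```perl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::CSV_XS;

    # Hypothetical sample data; a real script would open an actual file instead.
    my $data = join "\n",
        '"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""',
        '"second row",42',
        '';

    # binary => 1 is an assumption: it allows embedded newlines and
    # non-ASCII bytes inside quoted fields.
    my $csv = Text::CSV_XS->new( { binary => 1, escape_char => '\\' } )
        or die "Cannot use Text::CSV_XS: " . Text::CSV_XS->error_diag;

    # Read from an in-memory filehandle so the sketch is self-contained.
    open my $fh, '<', \$data or die "Cannot open in-memory handle: $!";

    my @rows;
    while ( my $row = $csv->getline($fh) ) {
        push @rows, $row;    # $row is an array reference, one element per column
    }
    # getline returns false both at end of input and on a parse error.
    $csv->eof or die "Parse error: " . $csv->error_diag;
    close $fh;

    print join( ' | ', @$_ ), "\n" for @rows;
    ```

    Checking eof after the loop is what distinguishes a clean end of input from a malformed record.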
    motobói

      Well...

      Good question ;)...

      Maybe I thought it was easier to write the code than to look for a module... how dumb of me... anyway, it was fun to write it... :)

      daniel