in reply to split $data, $unquoted_value;
I had the same problem when I wanted to split CSV files into their columns. What I found is that I couldn't do it with a regexp, so I did it the way it would be done without regexps...
Basically, I read the line character by character and push each context I enter onto a stack. For instance:
"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""
should be split into
This is a column
Yes, I know
12323
23123.23
This is a "column"
OK, no more talking... the code speaks for itself:
    #!/usr/bin/perl -w
    use strict;

    my $origin = '"This is a column","Yes, I know",123123,23123.23,"This is a \"column\""';
    my @cols = parse_line($origin);
    print join("\n", @cols)."\n";

    sub parse_line {
        my $line = shift;
        my @contexts;                 # stack of the contexts we are nested in
        my $context = "";             # current context: "", "string" or "escape"
        my $column = "";
        my @cols;
        my $string_delim = '"';
        my $escape_char = "\\";
        my $field_delim = ',';
        for (my $i = 0; $i < length $line; $i++) {
            my $c = substr($line, $i, 1);
            if ($c eq $string_delim) {
                if ($context eq "string") {
                    # closing quote: go back to the enclosing context
                    $context = pop @contexts;
                } elsif ($context eq "escape") {
                    # escaped quote: keep it as a literal character
                    $column .= $c;
                    $context = pop @contexts;
                } else {
                    push @contexts, $context;
                    $context = "string";
                }
            } elsif ($c eq $escape_char) {
                if ($context eq "escape") {
                    # escaped escape character: keep it as a literal character
                    $column .= $c;
                    $context = pop @contexts;
                } else {
                    push @contexts, $context;
                    $context = "escape";
                }
            } elsif ($c eq $field_delim) {
                if ($context eq "string") {
                    # delimiter inside a quoted string is part of the column
                    $column .= $c;
                } elsif ($context eq "escape") {
                    $column .= $c;
                    $context = pop @contexts;
                } else {
                    # end of the current column
                    push @cols, $column;
                    $column = "";
                }
            } else {
                $column .= $c;
                # an escape only applies to the character right after it
                $context = pop @contexts if $context eq "escape";
            }
            if ($i == length($line) - 1) {
                # end of line: flush the last column
                push @cols, $column;
            }
        }
        return @cols;
    }
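As a cross-check for the hand-rolled parser, the same sample line can be fed to the core module Text::ParseWords. This is only a sketch of a different technique, not the character-by-character approach from the post; the variable names are just for illustration. Note that Text::ParseWords also treats single quotes as quote characters, so it will not behave the same on every input.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Same kind of sample line as in the post: double-quoted fields,
# with backslash-escaped quotes inside the last one.
my $line = '"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""';

# quotewords(DELIM, KEEP, LINE): DELIM is used as a regexp; KEEP = 0 drops
# the surrounding quotes and the escaping backslashes from the fields.
my @cols = quotewords(',', 0, $line);

print "$_\n" for @cols;
```

If both approaches give the same five columns for a line, that is a decent sign the context stack is being unwound correctly.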
Replies are listed 'Best First'.
Re^2: split $data, $unquoted_value;
by motobói (Beadle) on Sep 15, 2005 at 13:33 UTC
by ruoso (Curate) on Sep 15, 2005 at 17:09 UTC