I had the same problem when I wanted to split CSV lines into their columns. I couldn't find a way to do it with a regexp, so I did it the way it would be done without regexps...
Basically, I read the line character by character and keep a stack of the contexts I enter (inside a string, after an escape character)... For instance...
"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""
should be split into
This is a column
Yes, I know
12323
23123.23
This is a "column"
OK, no more talking... the code speaks for itself...
    #!/usr/bin/perl -w
    use strict;

    my $origin = '"This is a column","Yes, I know",123123,23123.23,"This is a \"column\""';
    my @cols = parse_line($origin);
    print join("\n", @cols)."\n";

    sub parse_line {
        my $line = shift;
        my @contexts;              # stack of the contexts we are nested in
        my $context = "";          # "", "string" or "escape"
        my $column = "";
        my @cols;
        my $string_delim = '"';
        my $escape_char = "\\";
        my $field_delim = ',';
        for (my $i = 0; $i < length $line; $i++) {
            my $c = substr($line, $i, 1);
            if ($c eq $string_delim) {
                if ($context eq "string") {
                    # closing quote: go back to the enclosing context
                    $context = pop @contexts;
                } elsif ($context eq "escape") {
                    # escaped quote: keep it as a literal character
                    $column .= $c;
                    $context = pop @contexts;
                } else {
                    # opening quote
                    push @contexts, $context;
                    $context = "string";
                }
            } elsif ($c eq $escape_char) {
                if ($context eq "escape") {
                    # escaped escape character
                    $column .= $c;
                    $context = pop @contexts;
                } else {
                    push @contexts, $context;
                    $context = "escape";
                }
            } elsif ($c eq $field_delim) {
                if ($context eq "string") {
                    $column .= $c;
                } elsif ($context eq "escape") {
                    $column .= $c;
                    $context = pop @contexts;
                } else {
                    # unquoted delimiter: the column ends here
                    push @cols, $column;
                    $column = "";
                }
            } else {
                $column .= $c;
                # an escape only covers the character right after it
                $context = pop @contexts if $context eq "escape";
            }
            if ($i == length($line) - 1) {
                # end of line: flush the last column
                push @cols, $column;
                $column = "";
            }
        }
        return @cols;
    }
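For comparison, here is a minimal sketch that splits the same kind of line with `quotewords` from the core Text::ParseWords module. This is a swapped-in technique, not the hand-written parser above, and it is handy mainly as a cross-check of the expected columns. It assumes the input matches the sample (double-quoted fields, backslash as the escape character); note that `quotewords` also treats single quotes as quoting characters, so it is not a general CSV parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Same shape as the sample line: quoted fields, a comma inside quotes,
# plain numbers, and backslash-escaped quotes inside a quoted field.
my $line = '"This is a column","Yes, I know",12323,23123.23,"This is a \"column\""';

# Second argument 0: drop the surrounding quotes and the escaping backslashes.
my @cols = quotewords(',', 0, $line);

print join("\n", @cols), "\n";
```

If both approaches give the same list for your data, that is a reasonable sanity check on the character-by-character version.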
In reply to Re: split $data, $unquoted_value;
by ruoso
in thread split $data, $unquoted_value;
by Ovid