Re (tilly) 1: csv output
by tilly (Archbishop) on Mar 14, 2002 at 19:05 UTC
|
Please do not go with the existing wrong answers. Here is a description of the basic CSV spec as implemented in most Microsoft products:
- Rows are delimited by returns. (\r\n or \n depending on the platform, binmode, etc.)
- Fields within a row are delimited by ",". (When saving in "text" format the separator is often "\t" instead.)
- Fields may be quoted or unquoted.
- Quoted fields are literal text that start and end with an unpaired ". Separators, returns, etc can appear within a quoted field, and " can appear doubled.
- Unquoted fields cannot contain the separator, returns, or quotation marks. They are also subject to some interpretation. For instance numbers may appear in floating point, and an empty field is a null (represented within Perl by undef - very few parsers get this right).
- It is customary for the first row to be the field names, and for all rows to have the same number of fields.
With that in mind, here is a snippet to format a row:
# Takes an array and returns it as a CSV row
sub format_csv {
my @fields = @_;
foreach (@fields) {
if (not defined($_)) {
$_ = "";
}
elsif (0 == length($_)) {
$_ = '""';
}
elsif (/\s|"|'|,/) {
s/"/""/g;
$_ = qq("$_");
}
}
(join ",", @fields) . "\n";
}
With that function, supposing that $file was a file you wanted to write, @cols an array of columns that you wanted to put in a CSV file, and @data was an array of hash references with your data (see References Quick Reference if you don't know what an array of hash references is), you could write it as follows:
local *FILE;
open (FILE, "> $file") or die "Cannot write '$file': $!";
print FILE format_csv(@cols);
foreach my $row (@data) {
print FILE format_csv(@$row{@cols});
}
close(FILE) or die "Cannot close '$file': $!";
Note that I have put in error checking as very wisely recommended in perlstyle... | [reply] [d/l] [select] |
|
|
"([^"\\]*(\\.[^"\\]*)*)",?|([^,]+),?|,
Would that introduce Microsoft-incompatability?
U28geW91IGNhbiBhbGwgcm90MTMgY
W5kIHBhY2soKS4gQnV0IGRvIHlvdS
ByZWNvZ25pc2UgQmFzZTY0IHdoZW4
geW91IHNlZSBpdD8gIC0tIEp1ZXJk
| [reply] [d/l] |
|
|
Save the following in your text editor and open in Excel to verify the incompatibility.
hello,world
"this","is "",test 1"
"this","is \",test 2"
"this","is "" test 3"
"this","is \" test 4"
Note in particular the unusual handling of the even tests. | [reply] [d/l] |
|
|
|
|
|
Re: csv output
by strat (Canon) on Mar 14, 2002 at 17:45 UTC
|
There are several possibilities to do so:
You could do it with the Modules DBI and DBD::Csv, (treated as database), or Text::Csv, or do it by yourself.
The rules are about: The columns are separated with a special char; if this char is inside a column, then the column is embraced by "" (or sometimes '').
my $csvSep = ','; # comma as csv-separator
foreach my $line (@rows){ # e.g. 2-dim array
my @columns = @$line;
print join($csvSep, # join by $csvSep
map {
/\Q$csvSep/ ? "\"$_\"" : $_ # if $csvSep found in Value: "val"
} @columns; # take columns
), "\n"; # and a newline...
} # foreach
Best regards,
perl -le "s==*F=e=>y~\*martinF~stronat~=>s~[^\w]~~g=>chop,print" | [reply] [d/l] |
|
|
You at least put some effort into this answer, but your code is not sufficient to produce valid output if, for instance, the contents of a field was just a double-quotation mark.
| [reply] |
Re: csv output
by strat (Canon) on Mar 15, 2002 at 14:56 UTC
|
I've just found a new Module: Text::CSV_XS, which is a newer and faster solution than Text::CSV. It is partially implemented in C.
use Text::CSV_XS;
my $csv = Text::CSV->new;
# reading
$csv->parse(<STDIN>);
my @fields = $csv->fields;
#or writing:
$csv->combine(@fields);
print $csv->string;
I found this module via "Data Munging" from David Cross
Best regards,
perl -le "s==*F=e=>y~\*martinF~stronat~=>s~[^\w]~~g=>chop,print" | [reply] [d/l] |
Re: csv output
by oubiwann (Sexton) on Mar 15, 2002 at 06:19 UTC
|
Hey everybody, maybe we need to lighten things up a little bit in here?
Toni, first thing: relax :-)
Second, there's more than one way to do it ;-)
Third, maybe we could be more helpful if we know what you are trying to do, what your limitations are, etc.
For instance, do you need to use Excel? Are you open to another method? What programs are going to be accessing the CSV?
Update!: This was not a slam on Toni! I was trying to make this person feel more comfortable in all the tension... I should have been more clear. Perhaps I should have said "first thing, it's all good :-)"
It's just that there's this Tibetan guy I know and he is always laughing and saying "relax..."
It's all good. | [reply] |
|
|
You know, I find this absolutely incredible:
I was given a negative marking on this post!
Is this not a community of Fellowship? Should we not work together to better ourselves? Was my intention not clearly one of helpfulness?
Why then, the need to lash out? Why can't it be a hand of friendship instead of a hand that seeks to damage?
| [reply] |
|
|
Perhaps its because you told Toni to chill out. Toni hasn't posted a single node in this thread except for the original question....
| [reply] |
|
|
|
|
|
|
|
|
|
|
|
| A reply falls below the community's threshold of quality. You may see it by logging in. |