Re: swapping PIPE for comma in CSV file

Re^2: swapping PIPE for comma in CSV file by sgt (Deacon) on Jun 27, 2007 at 14:45 UTC
For the OP: just to emphasize the importance of `binary` in case your definition of CSV permits embedded newlines.

For Merijn: I was going to answer more or less the same to the OP yesterday, but came across a few problems that made me reinstall the latest versions...

One problem was that I used IO::Wrap objects for stdin and stdout, and they don't work with the pure-Perl version; I am not sure why. Maybe it would be better to load IO::Handle directly and have something for those who want efficiency. In this thread I wanted to test the pure-Perl version, as installing an XS module could have been problematic for the OP. I think that keeping both versions in sync is important...

For some reason search.cpan.org gives version 0.29 of Text::CSV_XS, but perl -MCPAN -e 'install qw(Text::CSV_XS)' installs 0.30, the right one I believe, if I remember your post on p5p or pm.

    % steph@apexPDell2 (/home/stephan) %
    % cat conv_comma2pipe_xs.px
    #!/usr/bin/perl
    use strict;
    use warnings;

    $|++;

    #use IO::Handle;
    use IO::Wrap;
    use Text::CSV_XS;

    # use DDS;
    # my $in  = IO::Wrap::wraphandle(*STDIN)  or die;
    # my $out = IO::Wrap::wraphandle(*STDOUT) or die;
    # Dump($in, $out);

    my $csv_in = Text::CSV_XS->new({
        binary => 1,
    }) or die;

    my $csv_out = Text::CSV_XS->new({
        binary   => 1,
        sep_char => q{|},
        eol      => qq{\n},
    }) or die;

    while (defined (my $rec = $csv_in->getline(*STDIN)) ) {
        {
            my @fields = @$rec;
            local $" = q{][};
            print {*STDERR} ".rec [@fields]\n";
        }
        $csv_out->print(*STDOUT, $rec);
    }
    __END__

    % steph@apexPDell2 (/home/stephan) %
    % cat hi1.csv | perl+ -w conv_comma2pipe_xs.px
    .rec [a][b][c]
    a|b|c
    .rec [a][okay, comma][c]
    a|"okay, comma"|c
    .rec [a][long line, indeed][end]
    a|"long line, indeed"|end

    % steph@apexPDell2 (/home/stephan) %
    % cat hi1.csv
    a,b,c
    a,"okay, comma",c
    a,"long line, indeed",end

cheers --stephan

p.s. I tested on cygwin with perl 5.8.7 and 5.8.8

update: oops, forgot the code...
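To make the point about `binary` and embedded newlines concrete, here is a minimal sketch, not part of the original post. It assumes a Text::CSV_XS recent enough to read from in-memory filehandles, and the sample record and the variable names (`$data`, `$converted`) are made up for illustration. A quoted field spanning two lines is read as a single field and written back out with the pipe separator.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::CSV_XS;

# Illustrative input: one record whose second field contains a newline.
my $data = qq{a,"line one\nline two",c\n};
open my $in, '<', \$data or die "cannot open in-memory input: $!";

# binary => 1 lets the parser accept the newline inside the quoted field.
my $csv_in = Text::CSV_XS->new({ binary => 1 })
    or die "cannot create parser";
my $csv_out = Text::CSV_XS->new({
    binary   => 1,
    sep_char => q{|},
    eol      => qq{\n},
}) or die "cannot create writer";

# Collect the converted output in a string instead of STDOUT.
my $converted = '';
open my $out, '>', \$converted or die "cannot open in-memory output: $!";

while (my $rec = $csv_in->getline($in)) {
    $csv_out->print($out, $rec);
}
close $out;

print $converted;
```

Reading with `getline` rather than splitting the input on newlines first is what keeps the two-line field together.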
Re^3: swapping PIPE for comma in CSV file by Tux (Canon) on Jun 28, 2007 at 07:00 UTC
> One problem was that I used IO::Wrap objects for stdin and stdout and they don't work with the pure perl version, I am not sure why. Maybe it would be better to load IO::Handle directly and have something for those who want efficiency. In this thread I wanted to test the pure perl version as installing an XS module could have been problematic for the OP. I think that keeping in sync both versions is important...

The maintainer of Text::CSV_PP is doing a really nice job of trying to keep it in sync with Text::CSV_XS, and we have a lot of contact about that. I already had a look at version 1.06, and it passed all tests for 0.30 except the diagnostics tests, which is logical and explainable.

That maintainer also got the maintainership of the very old Text::CSV, which will become a wrapper around Text::CSV_XS and Text::CSV_PP and choose the one available, based on a method used in DBI::PurePerl: the environment variable `TEXT_CSV_XS`. It will default to the fastest method available.

I have been thinking about the use of IO::Handle, making it either the default or `use`'d automatically, but everything I have come up with so far implies a slowdown, which is IMHO unacceptable. It's a bit of a shame that this is a relatively expensive module to load (14 kb of source code).

> for some reason search.cpan.org gives the version 0.29 Text::CSV_XS but perl -MCPAN -e 'install qw(Text::CSV_XS)' installs 0.30 the right one I believe if I remember your post on p5p or pm.

Maybe I've been working too hard lately on this module, and uploaded too many versions :) Give CPAN some time to sync around the world.

I'll have a look at the IO::Wrap thingy.

Enjoy, Have FUN! H.Merijn
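The wrapper idea described above (pick Text::CSV_XS when it is available, fall back to Text::CSV_PP otherwise, with an environment variable to steer the choice) could be sketched roughly as follows. This is not the actual Text::CSV code: `pick_csv_backend` is a hypothetical helper, and the meaning given to `TEXT_CSV_XS` here (a value of `0` skips the XS module) is an assumption made only for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the name of the first CSV backend that loads,
# or nothing if neither is installed.
sub pick_csv_backend {
    # Fastest first: the XS module, then the pure-Perl one.
    my @candidates = qw(Text::CSV_XS Text::CSV_PP);

    # Assumption for this sketch: TEXT_CSV_XS=0 means "do not use XS".
    shift @candidates
        if defined $ENV{TEXT_CSV_XS} && $ENV{TEXT_CSV_XS} eq '0';

    for my $class (@candidates) {
        # Turn Some::Module into Some/Module.pm for require.
        (my $file = "$class.pm") =~ s{::}{/}g;
        return $class if eval { require $file; 1 };
    }
    return;
}

my $backend = pick_csv_backend();
print defined $backend
    ? "using $backend\n"
    : "no CSV backend installed\n";
```

Since both modules are meant to stay in sync, the caller could then do `$backend->new({ binary => 1 })` without caring which one was loaded.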