in reply to Modifying CSV File
I am also very curious what you did to make Text::CSV_XS four times slower than your code.
For what it is worth, I tried to come up with the most efficient Text::CSV_XS code to attack your situation, only to find a bug in the module, which is now fixed in version 1.19. The code would be something like this:
```perl
use 5.20.0;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
    binary         => 1,
    auto_diag      => 1,
    keep_meta_info => 11,
    });
my $row = $csv->getline (*DATA); # get the first line to count the number of fields
my @row = @$row;
$csv->bind_columns (\(@row));    # speed up all remaining fetches
do {
    $row[$_] =~ tr{,}{-} for grep { $csv->is_quoted ($_) } 0 .. $#row;
    $csv->say (*STDOUT, \@row);
    } while ($csv->getline (*DATA));

__END__
65722417,"1193,1",7980,1133566,4169735,035,FEDERAL UNIVERSAL SERVICE FUND,0.12998
65722417,"1193,1",1012,1132900,4150053,C2,Carrier Cost Recovery Fee,0.0273
```
```
$ perl test.pl
65722417,"1193-1",7980,1133566,4169735,035,"FEDERAL UNIVERSAL SERVICE FUND",0.12998
65722417,"1193-1",1012,1132900,4150053,C2,"Carrier Cost Recovery Fee",0.0273
```
Not being dynamic, not forcing fields with a space to be quoted, and reading from a file instead of the DATA section, that would be:
```perl
use 5.20.0;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
    binary         => 1,
    auto_diag      => 1,
    quote_space    => 0,
    keep_meta_info => 11,
    });
open my $io, "<", "test.csv" or die "test.csv: $!";
my @row = ("") x 8;           # the CSV has 8 columns
$csv->bind_columns (\(@row));
while ($csv->getline ($io)) {
    $row[$_] =~ tr{,}{-} for grep { $csv->is_quoted ($_) } 0 .. $#row;
    $csv->say (*STDOUT, \@row);
    }
```
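Why the `tr{,}{-}` runs only on quoted fields: in this data a comma can occur *inside* a field only when that field is quoted, so `is_quoted` pinpoints exactly the fields that need fixing. A minimal core-Perl sketch of the same idea (my own simplified parser for illustration, not Text::CSV_XS; it assumes no escaped embedded quotes and discards a trailing empty field):

```perl
use 5.20.0;
use warnings;

# Simplified stand-in for the Text::CSV_XS logic above: split a line on
# commas that are outside double quotes, then replace commas with dashes
# inside the quoted fields only.  Assumes no escaped quotes in fields.
sub dash_quoted {
    my ($line) = @_;
    # Quoted field first, otherwise anything up to the next comma
    my @fields = $line =~ /("(?:[^"]*)"|[^,]*)(?:,|$)/g;
    # Drop the artefact of the final zero-width match (this also drops a
    # genuine trailing empty field; acceptable for this sketch)
    pop @fields if @fields and $fields[-1] eq "";
    for (@fields) {
        tr{,}{-} if /^"/;   # only quoted fields can contain a comma
        }
    return join ",", @fields;
    }

say dash_quoted (q{65722417,"1193,1",7980});   # prints: 65722417,"1193-1",7980
```

This is of course slower and far less robust than the bound-columns approach above; it only illustrates why quoted fields are the only ones worth touching.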
I'd be very interested in how much worse that performs on big data sets than your code.
Replies are listed 'Best First'.

Re^2: Modifying CSV File
- by roho (Bishop) on Jun 17, 2015 at 19:53 UTC
- by Tux (Canon) on Jun 18, 2015 at 06:46 UTC
- by roho (Bishop) on Jun 18, 2015 at 09:55 UTC
- by Tux (Canon) on Jun 18, 2015 at 12:41 UTC
- by roho (Bishop) on Jun 19, 2015 at 15:38 UTC