I am also very curious what you did to make Text::CSV_XS four times slower than your code.
For what it is worth, I tried to come up with the most efficient Text::CSV_XS code for your situation, only to find a bug in the module, which is now fixed in version 1.19. The code would be something like this:
use 5.20.0;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
    binary         => 1,
    auto_diag      => 1,
    keep_meta_info => 11,
    });
my $row = $csv->getline (*DATA); # get the first line to count number of fields
my @row = @$row;
$csv->bind_columns (\(@row));    # speed up all remaining fetches
do {
    $row[$_] =~ tr{,}{-} for grep { $csv->is_quoted ($_) } 0..$#row;
    $csv->say (*STDOUT, \@row);
    } while ($csv->getline (*DATA));
__END__
65722417,"1193,1",7980,1133566,4169735,035,FEDERAL UNIVERSAL SERVICE FUND,0.12998
65722417,"1193,1",1012,1132900,4150053,C2,Carrier Cost Recovery Fee,0.0273
$ perl test.pl
65722417,"1193-1",7980,1133566,4169735,035,"FEDERAL UNIVERSAL SERVICE FUND",0.12998
65722417,"1193-1",1012,1132900,4150053,C2,"Carrier Cost Recovery Fee",0.0273
Making it non-dynamic (a fixed column count), not forcing fields with a space to be quoted, and reading from a file instead of the DATA section, that would be:
use 5.20.0;
use warnings;
use Text::CSV_XS;

my $csv = Text::CSV_XS->new ({
    binary         => 1,
    auto_diag      => 1,
    quote_space    => 0,
    keep_meta_info => 11,
    });
open my $io, "<", "test.csv" or die "test.csv: $!";
my @row = ("") x 8;           # the CSV has 8 columns
$csv->bind_columns (\(@row));
while ($csv->getline ($io)) {
    $row[$_] =~ tr{,}{-} for grep { $csv->is_quoted ($_) } 0..$#row;
    $csv->say (*STDOUT, \@row);
    }
I'd be very interested in how much worse this performs than your code on big datasets.
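For an actual comparison, Perl's core Benchmark module makes this easy to measure. The sketch below is my own harness, not code from the thread: it times a naive split against Text::CSV_XS parsing on copies of the sample row (the naive version is only a baseline and is known to mishandle the quoted comma in "1193,1"):

```perl
use 5.20.0;
use warnings;
use Benchmark qw( cmpthese );
use Text::CSV_XS;

# Hypothetical harness: 100 copies of the first sample row from above
my @lines = ('65722417,"1193,1",7980,1133566,4169735,035,FEDERAL UNIVERSAL SERVICE FUND,0.12998') x 100;

my $csv = Text::CSV_XS->new ({ binary => 1, auto_diag => 1 });

cmpthese (500, {
    naive  => sub {           # baseline; splits "1193,1" into two fields
        for my $line (@lines) {
            my @f = split m/,/ => $line;
            }
        },
    csv_xs => sub {           # correct parse of quoted fields
        for my $line (@lines) {
            $csv->parse ($line);
            my @f = $csv->fields;
            }
        },
    });
```

cmpthese prints a rate table for the two subs; to get representative numbers, replace @lines with rows read from your real file.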
In reply to Re: Modifying CSV File by Tux
in thread Modifying CSV File by roho