in reply to csv parsing with multiple missing values/multiple commas

The new Text::CSV_XS has callbacks for that:

    my $aoh = csv (
        in        => $xFile[$k],
        headers   => "auto",
        callbacks => { after_parse => sub { $_ ||= 0 for @{$_[1]} } },
        );

As an example:

    $ cat test.pl
    #!/pro/bin/perl

    use 5.16.2;
    use warnings;

    use Text::CSV_XS qw(csv);
    use Data::Peek;

    DDumper (csv (in => *DATA, headers => "auto", callbacks => {
        after_parse => sub { $_ ||= 0 for @{$_[1]} }}));

    __END__
    a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,A,B,C,D,E,F,G,H,I,J,K
    11004516,0,0,9,9,3,12477,,,4,,0,,,3,38a947a1,b66b7850,6a14f9b9 11006995,1,,-1,,,,,
    $ perl test.pl
    [
        {
            A => 0,
            B => 0,
            C => 0,
            D => 0,
            E => 0,
            F => 0,
            G => 'fbc55dae',
            H => '9a89b36c',
            I => '58e67aaf',
            J => 'f600ec0b',
            K => 0,
            a => 11004516,
            b => 0,
            c => 0,
            d => 9,
            e => 9,
            f => 3,
            g => 12477,
            h => 0,
            i => 0,
            j => 4,
            k => 0,
            l => 0,
            m => 0,
            n => 0,
            o => 3,
            p => '38a947a1',
            q => 'b66b7850',
            r => '6a14f9b9 11006995',
            s => 1,
            t => 0,
            u => -1,
            v => 0,
            w => 0,
            x => 0,
            y => 0,
            z => 0
            }
        ]
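
The idiom inside the callback can also be tried on its own, without any CSV module. The following is only a minimal sketch: fill_blanks is a hypothetical helper name, and the array ref passed to it stands in for the row that after_parse receives as $_[1].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes an array ref of fields, standing in for
# the row that an after_parse callback would receive as $_[1].
sub fill_blanks {
    my ($fields) = @_;
    # "for" aliases $_ to each element, so the assignment changes the
    # row in place. Any false value (undef, "", "0") ends up as 0.
    $_ ||= 0 for @$fields;
    return $fields;
}

my $row = fill_blanks (["11004516", "", undef, "-1", "38a947a1"]);
print join (",", @$row), "\n";    # prints 11004516,0,0,-1,38a947a1
```

Note that ||= tests for truth, not definedness, so empty strings are replaced as well as undefined fields. If only undefined fields should become 0, //= would be the operator to use instead.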

Update: as Parse::CSV uses Text::CSV_XS underneath, you can just add the callback to your code, provided your Text::CSV_XS is recent enough to support callbacks.

    my $csv = Parse::CSV->new (
        file           => $xFile[$k],
        header         => "auto",
        names          => 1,
        empty_is_undef => 1,
        auto_diag      => 1,
        callbacks      => { after_parse => sub { $_ ||= 0 for @{$_[1]} } },
        );


Enjoy, Have FUN! H.Merijn

Re^2: csv parsing with multiple missing values/multiple commas
by f77coder (Beadle) on Aug 02, 2014 at 15:36 UTC
    Hello, Thanks for your comment. I did have to update my version of Text::CSV_XS but I still get the errors.
      Try this:
      #!perl
      use 5.12.0;
      use warnings;
      use strict;
      use File::Basename;
      use Text::CSV_XS;
      use Benchmark;

      my $start = time;
      my $t0 = new Benchmark;
      print "\n Current Date and Time -> " . localtime() . "\n";

      my $Base = 'c:/temp';
      my $s_DIR=$Base.'';
      my $p_DIR=$Base.'';
      my $f_DIR=$Base.'';

      my @xFile = grep {-f $_} glob( "$s_DIR/x*.csv");

      for my $csvfile (@xFile){

        # input
        my ($name,$dir,$ext) = fileparse($csvfile, qr/\.[^.]*/ );
        print "processing $name \n";
        open my $IN,'<',$csvfile
          or die "Could not open $csvfile : $!";

        # outputs
        my $f_Pass = $p_DIR."/pass_table_".$name.'.txt';
        my $f_Fail = $f_DIR."/fail_table_".$name.'.txt';

        open my $PASS,'>',$f_Pass
          or die "Can't open output file $f_Pass : $!";
        open my $FAIL,'>',$f_Fail
          or die "Can't open output file $f_Fail : $!";

        # process
        my $indexF = 0;
        my $indexP = 0;

        my $csv = Text::CSV_XS->new({
          auto_diag => 1,
          binary => 1,
          callbacks => {
            after_parse => sub { $_ ||= 0 for @{$_[1] } },
          }
        });

        my $header = $csv->getline($IN);
        while ( my $colref = $csv->getline($IN) ){

          my $col0 = shift @$colref;
          my $col1 = shift @$colref;

          if( $col1 == 1 ){
            print $PASS join " ",@$colref,"\n";
            $indexP = $indexP + 1;
          } else {
            print $FAIL join " ",@$colref,"\n";
            $indexF = $indexF + 1;
          };
        };

        # report
        print " totalP $indexP totalF ".($indexF-0)." total ".($indexP+$indexF)." \n";
        printf "%% totalP/(totalF+totalP) = %.2f %% \n",($indexP/($indexP+$indexF)*100);

        close $PASS or die "Couldn't close output file $f_Pass : $!";
        close $FAIL or die "Couldn't close output file $f_Fail : $!";
      };

      my $t1 = new Benchmark;
      my $td = timediff($t1, $t0);
      printf "\nCode took: %s\n",timestr($td);
      printf "++Finished program in -> %5.2f seconds\n\n",time-$start;
      poj

        Wow, this works great.

        You switched to Text::CSV_XS instead of Parse::CSV. Were there that many problems with my implementation or do you think there is a bug in Parse::CSV?

        Again many thanks