Counting items in a CSV

zyzzogeton has asked for the wisdom of the Perl Monks concerning the following question:

Re: Counting items in a CSV by ikegami (Patriarch) on Jun 22, 2009 at 16:49 UTC
Yup, hashes are great for grouping. But since you have two data points for each date (count and size), you'll need some kind of 2d structure. I used a hash of hashes in the following:
`use Text::CSV_XS qw( );

# $fh is assumed to be a filehandle already opened on the CSV file.
my %size_by_date;
my $csv = Text::CSV_XS->new();
while ( my $row = $csv->getline($fh) ) {
    my ($name, $date, $size) = @$row;
    ++$size_by_date{$date}{count};         # number of rows seen for this date
    $size_by_date{$date}{size} += $size;   # running total of sizes for this date
}

# getline() returns false both at end of file and on a parse error.
die("csv parse error: " . $csv->error_diag() . "\n")
    if !$csv->eof();

for my $date (keys %size_by_date) {
    my ($count, $size) = @{ $size_by_date{$date} }{qw( count size )};
    print("$date: $count, $size\n");
}`
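To make the shape of that 2d structure concrete, here is a minimal sketch that builds the same hash of hashes from rows already held in memory, so it runs without any CSV module. The helper name `summarize_by_date` is hypothetical (it does not appear in the thread), and the rows are borrowed from the sample data posted further down the thread.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): group rows of
# [name, date, size] by date into a hash of hashes.
sub summarize_by_date {
    my (@rows) = @_;
    my %by_date;
    for my $row (@rows) {
        my ($name, $date, $size) = @$row;
        $by_date{$date}{count}++;          # autovivifies the inner hash on first use
        $by_date{$date}{size} += $size;    # running total per date
    }
    return \%by_date;
}

# Sample rows taken from the data posted later in the thread.
my @rows = (
    [ 'Name One',   '05/19/2009', 151397376 ],
    [ 'Name Two',   '05/19/2009', 123333441 ],
    [ 'Name One',   '05/20/2009', 183439993 ],
    [ 'Name Three', '05/20/2009', 8098123089 ],
);

my $summary = summarize_by_date(@rows);

# Sorting the keys gives a stable order; this is a plain string sort,
# which happens to be chronological for this particular sample.
for my $date (sort keys %$summary) {
    my ($count, $size) = @{ $summary->{$date} }{qw( count size )};
    print "$date: $count, $size\n";
}
```

Each outer key is a date and each inner hash carries exactly two keys, `count` and `size`, so adding a third per-date figure later would only mean adding another inner key.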
Re^2: Counting items in a CSV by zyzzogeton (Initiate) on Jun 22, 2009 at 17:21 UTC
Oh wow. A 2d structure seems so obvious now. I swear I bang my head against the wall sometimes cause it feels so good when I stop.
Re: Counting items in a CSV by perliff (Monk) on Jun 22, 2009 at 18:04 UTC
For a large number of columns in a tab-delimited or CSV file, it's quite easy to use existing modules to get the information you want. If the files are nicely structured, regardless of the number of columns, you can use the Data::CTable module. When combined with the Statistics::Descriptive module, you can get much more information from your data. Let's say your data looks like this:
`name,date,size
name1,date1,120
name2,date2,140
name3,date3,150`
Well, here's some code, hopefully easy enough to understand:
`use strict;
use Data::CTable;
use Statistics::Descriptive;

my $data = Data::CTable->new("data.txt");   # your csv file
$data->clean_ws();                          # clean up whitespace
my $sizecolumn = $data->col('size');        # get column by name

my $stat = Statistics::Descriptive::Full->new();
$stat->add_data($sizecolumn);
print "sum of the column size:", $stat->sum(), "\n";`
It's up to you: you can use the Statistics::Descriptive module to get much more information (sum, mean, median, standard deviation, etc.) from your data as well, or if you just need a simple sum you can add the elements of the array yourself.
perliff
----------------------
-with perl on my side
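For the "add the elements yourself" route, a minimal sketch using only the core List::Util module might look like the following. The `$sizes` array reference is a stand-in for whatever the column accessor returns, filled in here with the three sizes from the sample data above; nothing in it depends on Data::CTable.

```perl
use strict;
use warnings;
use List::Util qw( sum );

# Stand-in for the array reference holding the 'size' column
# (values taken from the three sample rows above).
my $sizes = [ 120, 140, 150 ];

# sum() returns undef for an empty list, so seed it with 0.
my $total = sum( 0, @$sizes );

# Guard against dividing by zero when the column is empty.
my $mean = @$sizes ? $total / @$sizes : 0;

print "sum of the column size: $total\n";
print "mean of the column size: $mean\n";
```

This is enough when only a total or an average is needed; Statistics::Descriptive earns its keep once median or standard deviation come into play.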
Re: Counting items in a CSV by bichonfrise74 (Vicar) on Jun 22, 2009 at 23:52 UTC
Another possible solution...
`#!/usr/bin/perl
use strict;
use Text::CSV;

my %hash;
my $csv = Text::CSV->new();

while ( my $line = <DATA> ) {
    if ( $csv->parse($line) ) {
        my @columns = $csv->fields();
        next if ( $columns[0] eq "Name" );        # skip the header row
        $hash{$columns[1]}->[0]++;                # count of rows per date
        $hash{$columns[1]}->[1] += $columns[2];   # total size per date
    }
}

print "$_ -- $hash{$_}->[0] -- $hash{$_}->[1]\n" for ( keys %hash );

__DATA__
"Name","Date","size"
"Name One","05/19/2009","151397376"
"Name Two","05/19/2009","123333441"
"Name One","05/20/2009","183439993"
"Name Three","05/20/2009","8098123089"`