in reply to Tabulating Data Across Multiple Large Files
(Caution: Untested.)

my @fields = qw( 1 Case Iter Fusion Type Tanks AFVs ADAs IFVs UAVS
                 Unknown Total Latency Decoys FalseNeg FalsePos );
my @key_fields  = @fields[0,1,3,4];
my @data_fields = @fields[2,5..15];

my %n; # key = "@key_field_vals"; val = count
my %r; # key = "@key_field_vals"; val = hashref: key = field, val = sum (and later, average)

while (<>) # read all files, in sequential order (not in parallel)
{
    chomp;
    my %rec;
    @rec{@fields} = split /,/;
    my $key = join ",", @rec{@key_fields};
    $n{$key}++;
    for my $f ( @data_fields )
    {
        $r{$key}{$f} += $rec{$f};
    }
}

# now each $r{$key}{$f} is the sum of that column for that key.
# convert them to averages.
for my $key ( keys %r )
{
    for my $f ( sort keys %{$r{$key}} )
    {
        $r{$key}{$f} /= $n{$key};
    }
}

# now you can convert the results to normal-looking records:
my @averages; # one per unique key-vector value.
for ( keys %r )
{
    my %rec;
    @rec{ @key_fields } = split /,/;
    @rec{ keys %{$r{$_}} } = values %{$r{$_}};
    push @averages, \%rec;
}

# now @averages is an array of records that look exactly like
# the data rows you read in, except that the data column values
# are averages, and the key field value vectors are unique.
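If you want the averaged records back out as CSV, here is a minimal (equally untested) sketch of an output step. It is not part of the code above; it just assumes each hashref in @averages carries every column named in @fields, which is how the records are built:

# hypothetical output step: one header row, then one CSV row per
# averaged record, in the same column order as the input files
print join(",", @fields), "\n";
for my $rec ( @averages )
{
    print join(",", map { $rec->{$_} } @fields), "\n";
}

Run the whole thing with the data files as arguments, e.g. perl tabulate.pl file1.csv file2.csv ... (script name is my own; the while (<>) loop reads whatever files you list).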
jdporter
The 6th Rule of Perl Club is -- There is no Rule #6.