in reply to Tabulating Data Across Multiple Large Files
One method that might be useful (hard to tell from your description) is to merge all of the files into one (using sort on Unix or sort -m if they are already sorted). Then all the keys that match will be together in the file and it is simple to write a program to process them. You may need to pre-process the files to make them suitable for this method.
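As a rough illustration of the merge step, here is a minimal sketch with two tiny made-up files (the names `file1` and `file2` and their contents are placeholders, not your data). It assumes each file is already sorted on its first comma-separated field, so `sort -m` only has to interleave them:

```shell
# Hypothetical sample data: two files, each already sorted on field 1.
tmp=$(mktemp -d)
printf '1,a,10\n3,c,30\n' > "$tmp/file1"
printf '1,a,5\n2,b,7\n'   > "$tmp/file2"

# -m merges already-sorted inputs instead of re-sorting them;
# -t, sets the field separator; -k1,1n compares field 1 numerically.
merged=$(sort -m -t, -k1,1n "$tmp/file1" "$tmp/file2")
printf '%s\n' "$merged"

# Clean up the temporary sample files.
rm -rf "$tmp"
```

After the merge, the two lines with key 1 sit next to each other, so a single pass over the output can total them without holding every file in memory.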
Updated: Yes, now that I see your data, it looks like this method would be appropriate. You can sort -t, -k 1n,2 -k 4,5 file1 file2 file3 ... > sorted. Then it's a simple matter to process the sorted file:
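    #!perl -w
    use strict;

    my @sum     = ();    # running totals for the current key
    my $prevkey = "";    # key of the group being accumulated

    while (<>) {
        chomp;
        my @data = split /,/, $_;
        next if $data[0] == 1;    # skip headers

        # The key is fields 0, 1, 3 and 4; fields 5 onward are summed.
        my $key = join(",", @data[0, 1, 3, 4]);
        if ($key eq $prevkey) {
            # Same key as the previous line: add to the running totals.
            for (0 .. $#data - 5) {
                $sum[$_] += $data[$_ + 5];
            }
        }
        else {
            # New key: print the finished group, then start a new one.
            dumpsums();
            $prevkey = $key;
            @sum     = @data[5 .. $#data];
        }
    }
    dumpsums();    # flush the last group

    sub dumpsums {
        if ($prevkey) {
            print "$prevkey,", join(",", @sum), "\n";
        }
    }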