in reply to Creative sorting and totalling of large flatfiles (aka pivot tables)
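For more complex CSV files, try Text::xSV, which handles the full CSV grammar. For simple comma-separated logs, a line-by-line loop with split is enough:

    open my $log, '<', $log_file or die "Could not open $log_file: $!\n";
    my %ip_count;
    while (<$log>) {
        chomp;
        # ...and any further fields after $date
        my ($ip, $severity, $date) = split /\s*,\s*/;
        $ip_count{$ip}++;
        # other summary stat calcs below
    }
    close $log;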
On an Athlon XP 2100+ system with a gig of memory, most stats calculations over 1 to 10 million records typically took 1 to 10 minutes. Even 30 of these will only take a few hours. As long as you process one line at a time and have enough RAM to hold your hashes, calculations should go quickly.
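As a rough sketch of the one-line-at-a-time approach with a pivot-style breakdown, something like the following might work. The helper name summarize_log, the nested %ip_severity_count hash, and the sample records are all made up for illustration, and the field order (ip, severity, date) is assumed from the loop above.

```perl
use strict;
use warnings;

# Hypothetical helper: tally records per IP and per (IP, severity) pair,
# reading one line at a time so only the hashes stay in memory.
sub summarize_log {
    my ($fh) = @_;
    my (%ip_count, %ip_severity_count);
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines
        # Assumed layout: ip, severity, date, then anything else
        my ($ip, $severity, $date) = split /\s*,\s*/, $line;
        $ip_count{$ip}++;
        $ip_severity_count{$ip}{$severity}++;
    }
    return (\%ip_count, \%ip_severity_count);
}

# Demo on an in-memory filehandle with made-up records
my $sample = <<'END';
10.0.0.1, high, 2003-01-01
10.0.0.2, low, 2003-01-01
10.0.0.1, low, 2003-01-02
10.0.0.1, high, 2003-01-03
END

open my $fh, '<', \$sample or die "Could not open sample data: $!\n";
my ($by_ip, $by_ip_sev) = summarize_log($fh);
close $fh;

# Print one row per IP: the total, then a column per severity seen
for my $ip (sort keys %$by_ip) {
    print "$ip total=$by_ip->{$ip}";
    for my $sev (sort keys %{ $by_ip_sev->{$ip} }) {
        print " $sev=$by_ip_sev->{$ip}{$sev}";
    }
    print "\n";
}
```

Memory use grows with the number of distinct keys rather than the number of lines, which is why the file size matters much less than how many different IPs and severities show up.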
-Mark