The machine is a 16-processor SGI Origin HPC with 1 GB of RAM per processor. When running job accounting, nothing of note seems to happen. I understand a database is desirable, and I intend to go that route eventually, but I would like to have both options - database and text file manipulation. | [reply] |
Nice hardware, but ... 1 GB/Processor? I know nothing of that hardware, but that (again) sounds like any given process will be limited to 1 GB minus any OS overhead. It very much depends upon the distribution of the contents of the file, but I could quite see the 8 million lines building a hash structure > 1 GB.
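If you want to sanity-check that estimate rather than guess, the CPAN module Devel::Size can measure a small sample of the structure and you can scale up from there. The keys below are made-up stand-ins for your real data, purely to exercise the same four-level shape:

    #!/usr/bin/perl
    # Rough sketch: build a sample of the deep hash and measure it with
    # Devel::Size, then extrapolate to 8 million entries. The keys are
    # illustrative assumptions, not your actual log values.
    use strict;
    use warnings;
    use Devel::Size qw(total_size);

    my %drops;
    for my $i ( 1 .. 10_000 ) {
        # Fake date/proto/dst:port/src:port keys with the same nesting depth.
        $drops{'2004-06-01'}{'tcp'}{"10.0.0.$i:80"}{"192.168.1.1:$i"}++;
    }

    my $bytes = total_size( \%drops );
    printf "10_000 entries use %d bytes (~%.0f bytes/entry)\n",
        $bytes, $bytes / 10_000;
    # Multiply bytes/entry by 8 million to estimate the full footprint.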
Most times when Perl runs out of RAM on my machine I get Perl's "Out of memory" error, but occasionally I get a segfault.
Maybe your job accounting would tell you if memory were a problem--I haven't the vaguest clue what it might contain--so you could rule it out; but if you have access to a top-like live monitoring program, it would be worth watching the process with that.
I think I would try filtering the input file into smaller files, say by protocol (assuming they're not all ICMP), and then process those separately and combine the results.
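A minimal sketch of that split, assuming whitespace-separated fields with the protocol in (say) the third column - adjust the parsing to match your actual log format:

    #!/usr/bin/perl
    # Split the big log into per-protocol files so each later pass only
    # has to build a much smaller hash. Field position is an assumption.
    use strict;
    use warnings;

    my %out;    # protocol => output filehandle
    open my $in, '<', 'packets.log' or die "packets.log: $!";

    while ( my $line = <$in> ) {
        # Assume the protocol (tcp/udp/icmp/...) is the 3rd whitespace field.
        my $proto = ( split ' ', $line )[2] || 'unknown';

        unless ( $out{$proto} ) {
            open $out{$proto}, '>', "packets.$proto.log"
                or die "packets.$proto.log: $!";
        }
        print { $out{$proto} } $line;
    }

    close $_ for $in, values %out;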
You don't actually show what you're doing with that monster structure. It looks like you're just counting the number of dropped packets per date/proto/dst:port/src:port.
If that is all you are doing, then there is little point in building the deep structure. You would achieve the same results by concatenating all those values into a string and using a single level of hash.
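For illustration only, here is a sketch of that flat-key approach; the parsing and field order are assumptions about your log layout, not your actual code:

    #!/usr/bin/perl
    # One hash keyed on a joined string instead of a four-level structure.
    use strict;
    use warnings;

    my %count;
    open my $in, '<', 'packets.log' or die "packets.log: $!";

    while ( my $line = <$in> ) {
        # Hypothetical parse -- replace with whatever extracts these fields.
        my ( $date, $proto, $src, $sport, $dst, $dport ) = split ' ', $line;

        # Single level: one key per date/proto/dst:port/src:port combination.
        $count{ join "\t", $date, $proto, "$dst:$dport", "$src:$sport" }++;
    }
    close $in;

    # Report: split the key back apart when printing.
    for my $key ( sort keys %count ) {
        my ( $date, $proto, $dst, $src ) = split /\t/, $key;
        print "$date $proto $dst $src : $count{$key} dropped\n";
    }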
That said, if the problem is memory, that may not help much.
Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail
"Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon
| [reply] |