in reply to Outputting Huge Hashes

What are those other hashes? As others suspect, they may be the problem. You could use Devel::Size's total_size function to track their growth. Do you get the right results in the output file? A lot of blank columns could indicate that you're creating new entries in those other hashes.
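A minimal sketch of that sort of checkpoint, assuming Devel::Size is installed (the hash names here are just illustrative stand-ins for whatever hashes the script actually keeps):

```perl
use strict;
use warnings;
use Devel::Size qw(total_size);

# Hypothetical stand-ins for the script's hashes.
my ( %results, %lookup );
$results{$_} = [ ($_) x 5 ] for 1 .. 1000;

# Print the deep size of each hash at a checkpoint (say, once per
# input chunk) to see which one is actually growing.
for my $pair ( [ results => \%results ], [ lookup => \%lookup ] ) {
    my ( $name, $ref ) = @$pair;
    printf "%-8s %6d keys, %d bytes\n",
        $name, scalar keys %$ref, total_size($ref);
}
```

total_size walks the whole structure, nested references and all, so it gives a much more honest picture than counting keys.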

You could use tied hashes instead. Something like DBM::Deep can store the data on disk instead of in memory so you don't suck up all of your RAM. You could also shove all of this into a real database server and get fancier with the problem. :)
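The DBM::Deep route is nearly a drop-in change, assuming the module is installed; the filename here is made up:

```perl
use strict;
use warnings;
use DBM::Deep;

# The hash lives in this file on disk, not in RAM.
my $db = DBM::Deep->new( 'huge_hash.db' );

# Nested structures work transparently.
$db->{some_key} = { nested => [ 1, 2, 3 ] };
print $db->{some_key}{nested}[2], "\n";    # prints 3
```

The trade-off, as the reply below notes, is speed: every access is a disk operation, so if the data genuinely fits in memory, a plain hash will be far faster.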

Good luck!

--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review

Replies are listed 'Best First'.
Re^2: Outputting Huge Hashes
by bernanke01 (Beadle) on Jan 31, 2006 at 16:38 UTC

    Yup, I actually started off using DBM::Deep, but realized that my dataset could squeeze into memory and so switched away from that. The DBM::Deep implementation takes quite a bit longer to run than the in-memory version (30 days vs. 12 hours based on two trials).

    Also, the output (until the program crashes, at least) looks perfectly normal. No blank or duplicate rows appear; it's just that at some point the system runs out of memory during output.