in reply to Threads in Perl: Just leaky?
So, I'm modifying a Perl script I wrote to use multi-threading. It takes about 30 minutes to run using 1-2% of the CPU on our machines, and with 10 threads doing the work at the same time it uses 10-20% of the CPU... but it runs out of memory.
What the script originally did was read 20,000+ files, parse them for certain bits of information, and put that into a hash of arrays of hashes of arrays of hashes, so that XML::Simple could output the relevant information into a neat 7.5MB XML file (later loaded into another script).
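For illustration, a structure of that shape might look like the sketch below. All of the key names (`hosts`, `disks`, and so on) are hypothetical, since the post doesn't say what the files actually contain:

```perl
use strict;
use warnings;

# Hypothetical sketch of the nested structure described above:
# a hash of arrays of hashes of arrays of hashes.
my %report = (
    hosts => [
        {
            name  => 'hostA',
            disks => [
                { mount => '/',    used_pct => 40 },
                { mount => '/var', used_pct => 70 },
            ],
        },
    ],
);

# XML::Simple can serialise such a reference tree directly, e.g.:
#   use XML::Simple qw(XMLout);
#   print XMLout(\%report, RootName => 'report', KeyAttr => []);

# Drill down through the nesting the same way the parser would
# have to when filling it in.
my $second_disk = $report{hosts}[0]{disks}[1];
print "$second_disk->{mount}: $second_disk->{used_pct}%\n";
```

The catch, as the original poster found, is that the whole tree has to sit in memory at once; with 20,000+ files feeding it (and ten threads each holding partial copies), that adds up fast.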
Without even reading further than this, the answer that pops into my head is 'Use a database!'
It's all very well to stash stuff into a hash, but (as you've discovered) this doesn't scale well. That's one of the things that databases are great for -- they take care of filing that stuff away for you, then giving it back later.
If you're leery of setting up a database, I can highly recommend SQLite as a solution. It's tiny and powerful, and there's nothing to configure. Just point it at a database file and start adding stuff. It just works, and performance is fantastic.
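A minimal sketch of what that looks like from Perl, using DBI with the DBD::SQLite driver (the `records` table and its columns are made up for the example; an in-memory database stands in for a real file):

```perl
use strict;
use warnings;
use DBI;

# Connect to SQLite via DBI. ":memory:" keeps the example
# self-contained; use a filename (e.g. "results.db") in practice.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1, AutoCommit => 0 } );

# Hypothetical schema: one row per bit of parsed information,
# instead of one giant nested hash in memory.
$dbh->do(q{
    CREATE TABLE records (
        file  TEXT NOT NULL,
        key   TEXT NOT NULL,
        value TEXT
    )
});

# Prepare once, execute per parsed item -- this is the loop that
# would replace pushing into the nested structure.
my $sth = $dbh->prepare(
    "INSERT INTO records (file, key, value) VALUES (?, ?, ?)");
$sth->execute( 'hostA.log', 'mount',    '/' );
$sth->execute( 'hostA.log', 'used_pct', '40' );
$dbh->commit;

# Pull it back out later when building the XML.
my ($count) = $dbh->selectrow_array("SELECT COUNT(*) FROM records");
print "stored $count rows\n";

$dbh->disconnect;
```

Because the data lives on disk (or in SQLite's own cache) rather than in one ever-growing Perl structure, memory use stays flat no matter how many files you parse, and the final XML can be built by querying rows back out in whatever order you need.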