Hi all, I have a problem I need an urgent solution for. I'm doing some calculations during which I build up a huge hash, estimated at a few GB. The hash has several hundred thousand keys, each pointing to an array of values, and I keep appending values to these arrays as the calculation proceeds. This of course doesn't fit in the memory of a normal PC.
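For the record, the in-memory structure is just a hash of arrays, built up roughly like this (next_result() is a stand-in for my actual calculation):

    my %huge_hash;
    while (my ($key, $value) = next_result()) {   # stand-in for the real calculation
        push @{ $huge_hash{$key} }, $value;       # append to this key's array
    }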
I then tried tying the hash to a file in the hope of keeping memory use down, but it became unacceptably slow. As far as I understand, this is a classic memory-vs-speed trade-off.
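What I tried looked roughly like this (simplified; the filename is made up, and since a tied DB_File hash can only store flat strings, each key's value list gets joined on commas):

    use strict;
    use warnings;
    use Fcntl;
    use DB_File;

    # tie the hash to an on-disk BTree
    tie my %hash, 'DB_File', 'huge_hash.db', O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "cannot tie huge_hash.db: $!";

    # for each ($key, $value) the calculation produces:
    sub append_value {
        my ($key, $value) = @_;
        # appending means rewriting the whole stored string, and every
        # store is a random disk write -- which I suspect is the slow part
        $hash{$key} = defined $hash{$key} ? "$hash{$key},$value" : $value;
    }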
I then thought of the following: do a few steps of the calculation at a time, then write out the part of the hash accumulated so far, sorted by key. I end up with a number of ~500MB files, each holding part of the huge hash with its keys in sorted order.
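The chunk-dumping step looks roughly like this (the file naming and the tab/comma record format are just what I made up, one "key<TAB>val1,val2,..." line per key):

    # write the current in-memory chunk to disk with keys sorted,
    # one record per line, then empty it for the next batch of steps
    sub flush_chunk {
        my ($chunk, $n) = @_;
        open my $out, '>', sprintf("chunk_%04d.txt", $n)
            or die "cannot write chunk $n: $!";
        for my $key (sort keys %$chunk) {
            print $out join("\t", $key, join(",", @{ $chunk->{$key} })), "\n";
        }
        close $out;
        %$chunk = ();   # free the memory
    }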
Now my question is: how do I merge all these files without exhausting memory, and without it taking a month to complete?
I've kept a separate master_hash that holds only the sorted keys of the huge hash, without the values. But if I tie to one chunk file at a time and pull out the values key by key, it's far too slow. May I have some suggestions please? I've tried everything I can think of, and it's still taking more than a week just to assemble one huge hash. Many thanks!!
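To make the question concrete: what I'm hoping for is a k-way merge that keeps only one line per chunk file in memory, something like this sketch (same made-up file naming and record format as above, untested):

    # open every sorted chunk and read its first line
    my @files = glob("chunk_*.txt");
    my (@fh, @cur);
    for my $i (0 .. $#files) {
        open $fh[$i], '<', $files[$i] or die "$files[$i]: $!";
        $cur[$i] = readline $fh[$i];
        chomp $cur[$i] if defined $cur[$i];
    }

    open my $out, '>', 'merged.txt' or die "merged.txt: $!";
    while (grep { defined } @cur) {
        # find the smallest key among the current head lines
        my $min;
        for my $line (grep { defined } @cur) {
            my ($key) = split /\t/, $line, 2;
            $min = $key if !defined $min or $key lt $min;
        }
        # collect that key's values from every file whose head matches,
        # then advance those files by one line
        my @vals;
        for my $i (0 .. $#cur) {
            next unless defined $cur[$i];
            my ($key, $v) = split /\t/, $cur[$i], 2;
            next unless $key eq $min;
            push @vals, $v;
            $cur[$i] = readline $fh[$i];
            chomp $cur[$i] if defined $cur[$i];
        }
        print $out join("\t", $min, join(",", @vals)), "\n";
    }
    close $out;

Since every read and write here is sequential and memory stays at one record per file, this ought to finish in roughly the time it takes to stream the data once, rather than days.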