Hi all, I have a problem that I urgently need a solution for. I'm running some calculations that build up a huge hash, estimated at a few GB. The hash has hundreds of thousands of keys, each pointing to an array of values, and I just append values to these arrays as the calculation proceeds. This is of course more than the memory of a normal PC can handle.

I then tried tying the hash to a file in the hope of not using up all the memory, but that is unacceptably slow. As far as my limited understanding goes, this is a classic memory vs. speed trade-off.

I then thought of the following: do a few steps of the calculation at a time, and after each batch write the partial hash to a file, sorted by key. So I end up with a few files of about 500MB each, each storing part of the huge hash with its keys sorted.
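For illustration, the per-batch dump could look roughly like the sketch below. The one-line-per-key text layout (`key<TAB>val1 val2 ...`), the helper name `dump_sorted_chunk`, and the file name in the usage comment are assumptions made for this example, not a fixed format; it also assumes keys contain no tabs or newlines and values contain no whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper: write one partial hash-of-arrays to $path,
# one line per key, keys in plain string-sorted order:
#   key<TAB>val1 val2 val3 ...
# Assumes keys have no tab/newline and values have no whitespace.
sub dump_sorted_chunk {
    my ($hash_ref, $path) = @_;
    open my $out, '>', $path or die "Cannot write $path: $!";
    for my $key (sort keys %$hash_ref) {
        print {$out} join("\t", $key, join(' ', @{ $hash_ref->{$key} })), "\n";
    }
    close $out or die "Cannot close $path: $!";
    return;
}

# Usage after each batch of calculations (file name is illustrative):
#   dump_sorted_chunk(\%partial, 'chunk_001.txt');
#   %partial = ();   # release the memory before the next batch
```

Writing plain sorted text rather than a tied hash file means each chunk can later be read back strictly front to back, one line at a time.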

Now my question is: how do I merge all these files without exhausting my memory or taking a month to complete?

I have kept a separate master_hash that contains only the sorted keys of the huge hash, without the values. If I tie to one small hash file at a time and extract the values for the keys in order, it's way too slow. May I have some suggestions, please? I've tried everything I can think of, but it still takes more than a week just to put together one huge hash. Many thanks!!
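To make the question concrete, the kind of merge being asked about could be sketched as a streaming k-way merge: keep only the current line of each chunk file in memory, repeatedly emit the smallest key, and advance the files that held it. This is only a sketch under assumptions, not a tested solution at this scale: it assumes each chunk is plain text with one `key<TAB>val1 val2 ...` line per key, sorted in Perl's default string order, with each key appearing at most once per file and no blank lines. The names `next_record` and `merge_sorted_chunks` are made up for the example.

```perl
use strict;
use warnings;

# Read the next "key<TAB>values" record; returns an empty list at EOF.
sub next_record {
    my ($fh) = @_;
    my $line = <$fh>;
    return unless defined $line;
    chomp $line;
    my ($key, $values) = split /\t/, $line, 2;
    return ($key, defined $values ? $values : '');
}

# Merge sorted chunk files into $out_path, holding one line per file in memory.
sub merge_sorted_chunks {
    my ($out_path, @in_paths) = @_;
    my @heads;   # one [filehandle, key, values] per input that still has data
    for my $path (@in_paths) {
        open my $fh, '<', $path or die "Cannot read $path: $!";
        my ($key, $values) = next_record($fh);
        push @heads, [$fh, $key, $values] if defined $key;
    }
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while (@heads) {
        # Smallest key among the current heads (string order, like `sort keys`).
        my $min = $heads[0][1];
        for my $h (@heads) { $min = $h->[1] if $h->[1] lt $min }

        my (@merged, @still_open);
        for my $h (@heads) {
            if ($h->[1] eq $min) {
                # Collect this file's values, then advance it by one record.
                push @merged, $h->[2] if length $h->[2];
                my ($key, $values) = next_record($h->[0]);
                if (defined $key) { @$h[1, 2] = ($key, $values); push @still_open, $h }
                else              { close $h->[0] }
            }
            else { push @still_open, $h }
        }
        print {$out} "$min\t", join(' ', @merged), "\n";
        @heads = @still_open;
    }
    close $out or die "Cannot close $out_path: $!";
    return;
}
```

Passing the chunk files in the order they were produced keeps each key's values in the order they were appended. With only a handful of chunk files, the linear scan for the smallest key is cheap; with many files a heap would be the usual replacement.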


In reply to how to merge many files of sorted hashes? by andromedia33
