You have to put one of the two files into a hash, it doesn't really matter which one.

Actually, there's a good chance that it does matter. If one file has about 2 million rows/keys and the other has about 8 million, it will take noticeably less time and memory to store the keys of the smaller file in a hash. As GrandFather suggested above, there's a reasonable chance that a hash of 2 million elements will fit into RAM without causing the machine to thrash as virtual memory pages are bounced back and forth between RAM and the swap file.

But whether the hash is in memory or in a DBM file of some sort, creating 2 million keys will be quicker than creating 8 million (and it just seems to make more sense). Of course, once a hash has been built, lookup time is not likely to differ all that much (except when an "in-memory" hash is big enough to induce swapping), but the time and space needed to build the hash may differ significantly depending on the number of elements involved.
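Something along these lines would do it. This is only a minimal sketch, not tested against the OP's data: it assumes the key is the first whitespace-delimited field on each line, it uses file size as a rough stand-in for row count, and the names key_of and common_keys are made up for illustration.

```perl
use strict;
use warnings;

# Return the first whitespace-delimited field of a line.
# (Assumed key format; adjust to match the real data.)
sub key_of {
    my ($line) = @_;
    my ($key) = split ' ', $line;
    return $key;    # undef for blank lines
}

# Return a reference to the list of keys from the larger file
# that also occur in the smaller file.
sub common_keys {
    my ($file_a, $file_b) = @_;

    # Hash the smaller file; byte size is only a rough proxy for row count.
    my ($small, $large) = sort { (-s $a) <=> (-s $b) } $file_a, $file_b;

    # Build the hash from the smaller file only.
    my %seen;
    open my $in, '<', $small or die "Cannot open $small: $!";
    while (my $line = <$in>) {
        my $key = key_of($line);
        $seen{$key} = 1 if defined $key;
    }
    close $in;

    # Stream the larger file one line at a time; it is never held in memory.
    my @matches;
    open $in, '<', $large or die "Cannot open $large: $!";
    while (my $line = <$in>) {
        my $key = key_of($line);
        push @matches, $key if defined $key && exists $seen{$key};
    }
    close $in;

    return \@matches;
}

# When given two file names on the command line, print the shared keys.
if (@ARGV == 2) {
    print "$_\n" for @{ common_keys(@ARGV) };
}
```

Only the smaller file's keys ever sit in the hash, so memory use tracks the 2 million rows rather than the 8 million. Note that a key repeated in the larger file is reported once per occurrence.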


In reply to Re^2: Searching Huge files by graff
in thread Searching Huge files by biomonk