If the comparison you are doing doesn't involve any complex transformations of the data structure or really time-consuming math, then your CPUs will mostly sit around looking at the daisies while they wait for your hard disk to deliver the data. Disk I/O is slow, REALLY slow, compared to the speed of your memory or CPUs.

So no matter how many CPUs you throw at the job, the only things that probably matter in your case are how fast your disk (or disks) can read the data and what algorithm you are using.

And if the hashes are so big that they don't fit into RAM, your machine starts to swap, i.e. it moves part of its memory contents out to the hard disk, which makes you even more dependent on hard disk speed. In the worst case your program ends up doing almost nothing except swapping; this is called 'thrashing'.

So your solution might be, depending on your circumstances:
1) Buy a faster hard disk or use a RAID.
2) Do some preprocessing of your data so that it takes up less space.
3) Buy more RAM.
4) Put one of the huge files into a database and compare the second one against it by querying the database.
5) Depending on your data, use an algorithm that avoids reading both files completely into memory, for example by sorting both files and walking through them in parallel, as in the merge step of a merge sort.
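To make option 5 a bit more concrete, here is a minimal sketch in Python of a merge-style comparison. It is only an illustration under some assumptions that are not from the original question: each file holds one key per line, keys are unique within a file, and both files have already been sorted (for example by an external sort tool) in the same order Python uses to compare strings. The names `merge_compare` and `keys_from_file` are made up for this example.

```python
from typing import Iterable, Iterator, Tuple


def keys_from_file(path: str) -> Iterator[str]:
    """Stream keys from a file one line at a time (assumed: one key per line)."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            yield line.rstrip("\n")


def merge_compare(left: Iterable[str], right: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Walk two ascending-sorted key streams in parallel.

    Yields (status, key) where status is 'both', 'left_only' or 'right_only'.
    Only the current key of each stream is held in memory.
    """
    it_left, it_right = iter(left), iter(right)
    a = next(it_left, None)
    b = next(it_right, None)

    # Advance whichever stream currently has the smaller key.
    while a is not None and b is not None:
        if a == b:
            yield ("both", a)
            a = next(it_left, None)
            b = next(it_right, None)
        elif a < b:
            yield ("left_only", a)
            a = next(it_left, None)
        else:
            yield ("right_only", b)
            b = next(it_right, None)

    # One stream is exhausted; whatever remains exists on one side only.
    while a is not None:
        yield ("left_only", a)
        a = next(it_left, None)
    while b is not None:
        yield ("right_only", b)
        b = next(it_right, None)
```

You would call it as `merge_compare(keys_from_file("a.sorted"), keys_from_file("b.sorted"))`, with hypothetical file names. Memory use stays constant no matter how big the files are, so there is nothing to swap; the price is that both files have to be sorted first, and the whole run is still bound by how fast the disk can deliver the data.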


In reply to Re^3: changing parameters in a thread by jethro
in thread changing parameters in a thread by Anonymous Monk
