First of all, you're right: an index is probably a bad way to solve your problem. It's still possible (for example, you could build the index on disk rather than in memory if you needed to), but other monks have already posted ideas that you might want to try first.

But what we forget is that an array in Perl is an array of SCALARs, so we will use much more memory for each position than just 4 bytes. And randomizing this array in Perl will copy the array in memory, use more memory, and be very slow!

This is off topic, but there's a more memory-efficient way to handle very large arrays in Perl. You don't have to use normal arrays: you can store the values in a single string of packed bytes (built with the pack() function) and access the individual elements using the vec() and unpack() functions.
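Here is a minimal sketch of the packed-string idea, assuming each element is an unsigned 32-bit number (a byte offset into the file, say). The names $count and $packed and the sample values are made up for illustration; they are not from the original thread.

```perl
use strict;
use warnings;

# Hypothetical data: 100_000 unsigned 32-bit values in one string.
my $count  = 100_000;
my $packed = "\0" x (4 * $count);    # preallocate 4 bytes per element

# vec() can be assigned to: this sets element $_ (32 bits wide).
vec($packed, $_, 32) = $_ * 10 for 0 .. $count - 1;

print vec($packed, 42, 32), "\n";    # prints 420
print length($packed), "\n";         # prints 400000

# unpack() can read the same bytes: vec() stores 32-bit elements
# in big-endian order, which matches the 'N' template.
my ($third) = unpack 'N', substr($packed, 3 * 4, 4);
print "$third\n";                    # prints 30
```

The string costs 4 bytes per element plus a fixed overhead for the one scalar that holds it, whereas a normal array carries a full scalar for every element.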

Also, randomizing the array doesn't require a copy. You can shuffle the array "in place": loop over it from the last element down to the second, swapping each element with a randomly chosen element at or before its position. (Swapping with a random element taken from the whole array on every step gives a biased shuffle.) This technique is called the "Fisher-Yates shuffle".
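A rough sketch of an in-place Fisher-Yates shuffle, first on a normal array and then on a string of packed 32-bit values accessed with vec(). The sub names shuffle_in_place and shuffle_packed_in_place are my own, and the 32-bit element width is an assumption.

```perl
use strict;
use warnings;

# Shuffle a normal array in place; takes an array reference.
sub shuffle_in_place {
    my ($array) = @_;
    for (my $i = $#$array; $i > 0; $i--) {
        my $j = int rand($i + 1);             # 0 <= $j <= $i
        @$array[$i, $j] = @$array[$j, $i];    # swap elements $i and $j
    }
    return;
}

# Same loop on a packed string of 32-bit values; takes a string reference.
sub shuffle_packed_in_place {
    my ($packed_ref) = @_;
    my $last = length($$packed_ref) / 4 - 1;  # index of the last element
    for (my $i = $last; $i > 0; $i--) {
        my $j   = int rand($i + 1);
        my $tmp = vec($$packed_ref, $i, 32);
        vec($$packed_ref, $i, 32) = vec($$packed_ref, $j, 32);
        vec($$packed_ref, $j, 32) = $tmp;
    }
    return;
}

my @numbers = (1 .. 10);
shuffle_in_place(\@numbers);
print "@numbers\n";                           # order varies from run to run

my $packed = pack 'N*', 1 .. 10;
shuffle_packed_in_place(\$packed);
print join(' ', unpack 'N*', $packed), "\n";  # order varies from run to run
```

Neither sub allocates a second array; the only extra storage is the temporary used for each swap. For ordinary arrays, the shuffle function in List::Util is the usual choice, but it returns a new list rather than shuffling in place.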

-- AC


In reply to Re^3: Randomizing Big Files by Anonymous Monk
in thread Randomizing Big Files by Anonymous Monk
