For a file of 4Gb I need an array that uses 32bits to save each position. 4 byte * n_lines will use to much memory! I have +- 150.000.000 of lines, so is at least 572Mb just to load the positions, than how I handomize 150.000.000 of entries, since is something similar to sort all this entries?!

But what we forget is that an Array in Perl is an array of SCALARs, so, we will use much more memory for each position than just 4 bytes. And randomize this array in Perl will copy the array in the memory and use more meory and will be very slow!


In reply to Re^2: Randomizing Big Files by Anonymous Monk
in thread Randomizing Big Files by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.