Fellow monks,

I have a challenge.

How can I randomize the lines of a very big file using just pure Perl?

I have a 4 GB file, and I need to randomize the order of its lines so that I can distribute them over many processes without any pattern from the original ordering of the entries carrying over. The data is used to run searches, and I want a balanced distribution: if every process works on random entries, the searches are better balanced and the main DB grid where they run is not overloaded, since each type of search is handled by different servers in the grid.

Randomizing a big file is a hard problem: I need to work with all the lines in the file, and I can't use a solution that randomizes small blocks, since a line must be able to land anywhere in the whole file, not just inside its own block. I also can't load the whole file into memory, even as an array, or send it to a DB, since a DB would make the process far too slow.

So, how can I do that in pure Perl, quickly and with little memory?!

By gmpassos.
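
One possible approach to the constraints above, as a minimal sketch and not a tested solution for the 4 GB case: keep only the byte offset of each line in memory, shuffle the offsets, then seek to each offset and copy the line out. The sub name shuffle_file and its arguments are hypothetical, introduced only for illustration. The sketch assumes a 64-bit Perl (for the 'Q' pack template), an input file that does not change while it is read, and enough memory for 8 bytes per line. The second pass does one random seek per line, so it may be slow on spinning disks.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): writes the lines of
# $in_path to $out_path in random order and returns the line count.
sub shuffle_file {
    my ($in_path, $out_path) = @_;

    open my $in, '<', $in_path or die "Cannot read $in_path: $!";
    binmode $in;    # count bytes, not characters

    # Pass 1: record the byte offset where each line starts.
    # Offsets are packed into one string, 8 bytes per line.
    # Assumption: 64-bit Perl, since 'Q' is unavailable otherwise.
    my $index = '';
    my $count = 0;
    my $pos   = 0;
    while (defined(my $line = <$in>)) {
        $index .= pack 'Q', $pos;
        $pos   += length $line;
        $count++;
    }

    # Fisher-Yates shuffle over the packed offsets, swapping 8-byte slots.
    for (my $i = $count - 1; $i > 0; $i--) {
        my $j = int rand($i + 1);
        next if $i == $j;
        my $tmp = substr($index, $i * 8, 8);
        substr($index, $i * 8, 8) = substr($index, $j * 8, 8);
        substr($index, $j * 8, 8) = $tmp;
    }

    # Pass 2: visit the offsets in shuffled order and copy each line.
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    binmode $out;
    for my $i (0 .. $count - 1) {
        my $offset = unpack 'Q', substr($index, $i * 8, 8);
        seek($in, $offset, 0) or die "Cannot seek in $in_path: $!";
        my $line = <$in>;
        # A final line without a newline gets one, so it cannot merge
        # with whatever line is written after it.
        $line .= "\n" unless $line =~ /\n\z/;
        print {$out} $line;
    }
    close $out or die "Cannot close $out_path: $!";
    close $in;
    return $count;
}

# Usage: perl shuffle.pl input.txt output.txt
shuffle_file(@ARGV[0, 1]) if @ARGV == 2;
```

Every line can land at any position in the output, so this is a whole-file shuffle, not a per-block one. If even the packed index is too large, or the random seeks are too slow, a different technique (for example, scattering lines at random into several temporary files and shuffling each one in memory) may fit better.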


In reply to Randomizing Big Files by Anonymous Monk
