An option you didn't list: create an index. (Oops, I guess that's an implementation of option 4.) Basically, that's what Tie::File does, except that Tie::File builds its index in memory, whereas you'd build yours in a file.

There are two advantages to keeping the index on disk: Tie::File could easily require >1GB of memory in your situation, and an index on disk is built once and reused for as long as the data file doesn't change.

Read through the huge file, writing the starting offset of every line into another file as fixed-width binary records (such as pack('N2', $high, $low) or pack('Q', $addr), both of which produce 8 bytes per line).
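A minimal sketch of that indexing pass, assuming the pack('N2', ...) format so it doesn't depend on 'Q' being available. The name build_index and its arguments are made up for illustration, and it assumes offsets stay below 2**53 so the arithmetic is exact even on a perl without 64-bit integers:

```perl
use strict;
use warnings;

# Write the byte offset of every line start in $data_path to $index_path,
# one 8-byte record per line (high and low 32-bit halves, big-endian).
# Returns the number of lines indexed.
sub build_index {
    my ($data_path, $index_path) = @_;

    open(my $data,  '<:raw', $data_path)  or die "Can't open $data_path: $!";
    open(my $index, '>:raw', $index_path) or die "Can't create $index_path: $!";

    my $count  = 0;
    my $offset = 0;    # byte position where the next line starts
    while (defined(my $line = <$data>)) {
        # Split the offset into two 32-bit halves for pack 'N2'.
        my $high = int($offset / 4294967296);
        my $low  = $offset - $high * 4294967296;
        print {$index} pack('N2', $high, $low) or die "Write failed: $!";

        $offset += length($line);    # :raw layer, so length is in bytes
        $count++;
    }

    close($index) or die "Can't close $index_path: $!";
    close($data);
    return $count;
}
```

You'd call it once, e.g. build_index('huge.txt', 'huge.idx') (file names are placeholders), and rebuild only when the data file changes.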

Then you can seek to a random location in the index file (a multiple of 8, since each record written with either pack format above is 8 bytes long) and read the record there to learn where to seek in the data file.
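The lookup side might look something like this. It's only a sketch: fetch_line and line_count are hypothetical names, the index is assumed to hold 8-byte pack('N2', $high, $low) records, and both handles are assumed to be opened with '<:raw':

```perl
use strict;
use warnings;

# Return line $n (0-based) of the data file.
# $data and $index are open filehandles; $index holds one 8-byte
# pack('N2', $high, $low) record per line of $data.
sub fetch_line {
    my ($data, $index, $n) = @_;

    # Record $n lives at byte $n * 8 of the index.
    seek($index, $n * 8, 0) or die "Can't seek in index: $!";
    my $got = read($index, my $rec, 8);
    die "No index record for line $n" unless defined $got && $got == 8;

    # Reassemble the offset from its two 32-bit halves.
    my ($high, $low) = unpack('N2', $rec);
    my $offset = $high * 4294967296 + $low;

    seek($data, $offset, 0) or die "Can't seek in data file: $!";
    return scalar readline($data);
}

# Number of indexed lines = size of the index file / 8.
sub line_count {
    my ($index_path) = @_;
    return (-s $index_path) / 8;
}
```

A random line is then fetch_line($data, $index, int rand line_count($index_path)), which costs two seeks and two small reads no matter how big the data file is.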

Note: Perl IO is supposedly quite slow (was this fixed?), so you could write a tiny C program to create the index file, since that's pure IO and just as simple to write in C as in Perl.

Note: This requires that your Perl, including its seek and its tell, can handle 64-bit numbers.
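One way to check, sketched with the core Config module. Treating ivsize >= 8 as "pack 'Q' will work" is my assumption of a reasonable test, not a guarantee:

```perl
use strict;
use warnings;
use Config;

# Large file support: seek and tell can address offsets past the 32-bit limit.
my $large_files = defined $Config{uselargefiles}
               && $Config{uselargefiles} eq 'define';

# 64-bit integers: needed for pack('Q', $addr); pack('N2', ...) works without.
my $int64 = defined $Config{ivsize} && $Config{ivsize} >= 8;

print "Large file support: ", ($large_files ? "yes" : "no"), "\n";
print "64-bit integers:    ", ($int64       ? "yes" : "no"), "\n";
```

The same settings are visible from the command line with perl -V:uselargefiles and perl -V:ivsize.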


In reply to Re: Help performing "random" access on very very large file by ikegami
in thread Help performing "random" access on very very large file by downer
