The ways to reverse lines get a bit messy because they involve merging segments together (provided, of course, as your problem statement says, that you can't fit all the lines into memory at once). As it turns out, well-written system sort routines already handle huge files: they split the job up into smaller pieces and do all this merging for you.

My suggestion would be to make a pass through this huge file, adding a line number in the first column while generating a new file. Then use a system sort on that file, sorting on this first column in descending numeric order. Then make a pass through the sorted file to remove the "index" number that was added in the first pass. And you have a reversed file.
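The three passes above can be sketched roughly as follows. This is only an illustration under some assumptions: a POSIX-style `sort` command is on the PATH, a tab is used to separate the added index from the line, and the function name `reverse_via_sort` and the temporary file names are made up for this example. Only one line at a time is held in memory by the Python code; the heavy lifting is left to the system sort.

```python
import os
import subprocess
import tempfile


def reverse_via_sort(src_path, dst_path):
    """Reverse the line order of src_path into dst_path using the system sort.

    Hypothetical helper for illustration; assumes a POSIX-style `sort` on PATH.
    """
    # Keep the temporary files next to the output so they land on the same disk.
    work_parent = os.path.dirname(os.path.abspath(dst_path))
    with tempfile.TemporaryDirectory(dir=work_parent) as work:
        numbered = os.path.join(work, "numbered")
        sorted_path = os.path.join(work, "sorted")

        # Pass 1: prefix every line with its line number and a tab.
        with open(src_path, "rb") as src, open(numbered, "wb") as out:
            for n, line in enumerate(src, start=1):
                if not line.endswith(b"\n"):
                    line += b"\n"  # a last line without a newline gets one
                out.write(b"%d\t" % n + line)

        # Pass 2: let the system sort order the file by the first
        # tab-separated field, numerically and in descending order.
        env = dict(os.environ, LC_ALL="C")
        subprocess.run(
            ["sort", "-t", "\t", "-k1,1nr", "-o", sorted_path, numbered],
            check=True,
            env=env,
        )

        # Pass 3: strip the index (everything up to the first tab).
        with open(sorted_path, "rb") as src, open(dst_path, "wb") as out:
            for line in src:
                out.write(line.split(b"\t", 1)[1])
```

A call would look like `reverse_via_sort("huge.txt", "huge.reversed.txt")` (both file names are placeholders). Note that this needs free disk space for two extra copies of the file, plus whatever scratch space the sort itself uses.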

This idea may sound "stupid", but you may be surprised at how fast it can work. The actual speed depends on a lot of things. But you can write the code in a few minutes and let it run while you are pondering more efficient ways. And of course there are more efficient ways!

I don't want to get too involved in the more complex ways until I hear back about how fast this "stupid idea" ran and what limitations you hit.

I just propose that you write some simple code and set it running while you think about other ways.

Update: This "tac" idea is also good. It may or may not be faster, depending upon how it is implemented. The main point is to try the simple things first.
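For comparison, here is a rough sketch of the general idea behind reversing a file by reading it backward in fixed-size blocks. It is not a claim about how any particular "tac" is implemented; the function name `reverse_lines_backward` and the default block size are assumptions for illustration. Memory use stays around one block plus the longest line.

```python
import os


def reverse_lines_backward(src_path, dst_path, block_size=64 * 1024):
    """Write the lines of src_path to dst_path in reverse order.

    Hypothetical helper for illustration; reads the file from the end
    toward the start, one block at a time.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        src.seek(0, os.SEEK_END)
        pos = src.tell()
        if pos == 0:
            return  # empty input gives empty output

        # Ignore one trailing newline so it does not yield an empty "last line".
        src.seek(pos - 1)
        if src.read(1) == b"\n":
            pos -= 1

        tail = b""  # start of a line whose beginning lies in an earlier block
        while pos > 0:
            size = min(block_size, pos)
            pos -= size
            src.seek(pos)
            chunk = src.read(size) + tail
            parts = chunk.split(b"\n")
            tail = parts[0]  # possibly incomplete; finish it on the next pass
            for line in reversed(parts[1:]):
                dst.write(line + b"\n")

        # Whatever is left is the first line of the file.
        dst.write(tail + b"\n")
```

Every output line is written with a trailing newline, so an input whose last line lacks one gains it in the output.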


In reply to Re: reversing files line by line by Marshall
in thread reversing files line by line by naturalsciences
