merlyn's works if you can fit all of the data into (virtual) memory (and is the fastest unless you swap too much). fundflow's works if you are reading a single file that is seekable. Albannach's works if you are reading from more than one seekable file specified on the command line (and you are sure the files won't be renamed, for example, until after the program finishes). None of them works for all cases, but each of them is an acceptable solution for a large set of problems. So you'll need to decide what types of problems you plan to solve.

If you are pretty sure that you won't have to deal with really large files, then cache the lines in an array. When doing operations that require two passes, it is very common to only deal with one file at a time and require that the file be seekable. So using seek() (and dying if that fails) is often a very good choice.
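To make those two choices concrete, here is a rough sketch of each. The task (find the longest lines) and the sub names longest_lines_cached and longest_lines_seek are made up for illustration; they are not taken from anyone's post in this thread. Both take an already-opened filehandle.

```perl
use strict;
use warnings;

# Hypothetical example 1: cache the lines in an array.
# Fine as long as the whole input fits comfortably in memory.
sub longest_lines_cached {
    my ($fh) = @_;
    my @lines = <$fh>;
    chomp @lines;
    my $max = 0;
    for my $line (@lines) {                        # pass 1: find the longest length
        $max = length($line) if length($line) > $max;
    }
    return grep { length($_) == $max } @lines;     # pass 2: pick the matching lines
}

# Hypothetical example 2: read the handle twice, rewinding with seek().
# Dies if the handle can't be rewound (a pipe, for example).
sub longest_lines_seek {
    my ($fh) = @_;
    my $max = 0;
    while ( defined( my $line = <$fh> ) ) {        # pass 1
        chomp $line;
        $max = length($line) if length($line) > $max;
    }
    seek( $fh, 0, 0 )
        or die "Can't rewind input (not seekable?): $!\n";
    my @longest;
    while ( defined( my $line = <$fh> ) ) {        # pass 2
        chomp $line;
        push @longest, $line if length($line) == $max;
    }
    return @longest;
}
```

You'd call either one with something like open( my $fh, '<', $file ) followed by longest_lines_seek($fh). The seek() version keeps only one line in memory at a time, which is the whole point of bothering with it.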

In the very rare case where you need to do two passes over multiple files, some of which might be very large and some of which may not be seekable, I'd do something like this:

    use IO::File;
    my $cache= IO::File->new_tmpfile()
        or  die "Can't create temporary file: $!\n";
    print $cache $_
        or  die "Can't append to temporary file: $!\n"
        while  defined( $_= <> );
    seek( $cache, 0, 0 )
        or  die "Can't rewind temporary file: $!\n";
    while(  <$cache>  ) {
        ...
    }
    seek( $cache, 0, 0 )
        or  die "Can't rewind temporary file: $!\n";
    while(  <$cache>  ) {
        ...
    }

Of course, this doesn't work if you don't have enough temporary file space. But that puts the problem where it belongs: in the hands of the person trying to deal with such huge files who should arrange to have enough temporary space.

        - tye (but my friends call me "Tye")

In reply to (tye)Re: Using in multiple passes by tye
in thread Using in multiple passes by Anonymous Monk
