This is one of those situations where if I could save a minuscule amount of time per record, it could potentially shave half an hour off the run time of these monster processing jobs.

If you're running this off of Win32, you can save noticeable time by periodically defragging your drives.

Regardless of the OS, you can save substantial time if the OUTPUT file you're writing is on a different physical drive than the datafiles you're reading. It takes a lot of time (relatively speaking) to move disk heads across the disk to do "read a bit here, write a bit there" operations. If you can rig things so that drive heads move relatively small amounts (e.g., from track to track) while reading or writing, you can win big.

If you have to run everything off of one drive, then consider buffering your writes to OUTPUT. Perl's buffering will wait until a disk block is full before writing, but you can increase the effective buffer size by doing something like the following in your loop.

# $fuse starts out equal to $LINES_TO_BUFFER before the loop
push @buffer, join("|", @fields) . "\n";
if ( --$fuse == 0 ) {
    print OUTPUT @buffer;
    @buffer = ();
    $fuse = $LINES_TO_BUFFER;
}
Set $LINES_TO_BUFFER to something pretty big (10000 might be a good starting point), and be sure to print whatever is left in @buffer once the loop ends, or the last partial batch never reaches OUTPUT.
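
For what it's worth, here is a rough, self-contained sketch of the same idea, including the final flush after the loop. The sub name write_buffered, the sample records, and the in-memory filehandle are all made up for illustration; in a real job you'd print to your actual OUTPUT handle. Whether this beats Perl's own buffering on your data is something to benchmark rather than assume.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): writes pipe-delimited
# records to $out, printing $lines_to_buffer lines per print call.
# Returns how many print calls were made.
sub write_buffered {
    my ($out, $records, $lines_to_buffer) = @_;
    my @buffer;
    my $fuse    = $lines_to_buffer;   # counts down to the next flush
    my $flushes = 0;

    for my $fields (@$records) {
        push @buffer, join("|", @$fields) . "\n";
        if ( --$fuse == 0 ) {
            print {$out} @buffer;
            @buffer = ();
            $fuse   = $lines_to_buffer;
            $flushes++;
        }
    }

    # Flush the leftover partial batch after the loop ends.
    if (@buffer) {
        print {$out} @buffer;
        $flushes++;
    }
    return $flushes;
}

# Made-up sample data, written to an in-memory filehandle so the
# sketch runs without touching the disk.
my @records = map { [ "id$_", "name$_", $_ * 2 ] } 1 .. 5;
my $output  = '';
open(my $out, '>', \$output) or die "Cannot open in-memory handle: $!";
my $flushes = write_buffered($out, \@records, 2);
close($out);

print $output;
print "flushes: $flushes\n";
```

With five records and a batch size of two, the last record only gets written because of the flush after the loop, which is the part that's easy to forget.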


In reply to Re: Fastest I/O possible? by dws
in thread Fastest I/O possible? by Anonymous Monk
