File operations aren't always as black and white as they seem. The operating system does a lot of file caching behind the scenes, so you probably won't take as bad a performance hit as you'd think if you open and close the same file several times.

Have you tried Tie::File yet?
It also does caching and deferred writes and other types of optimization.
More importantly, you can set an upper limit on the amount of memory you want Tie::File to use for its cache, which could help prevent excessive swapping.
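In case it helps, here is a rough sketch of what I mean by capping Tie::File's memory. The temp file, the sample lines, the 20 MB figure, and the whitespace-stripping edit are all made up for illustration; only the memory option and the defer/flush calls are the point.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Throwaway sample file so the sketch runs on its own (hypothetical data).
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "alpha   \n", "beta\t\n", "gamma\n";
close $fh or die "Cannot close $file: $!";

# Cap Tie::File's read cache at roughly 20 MB (the value is in bytes).
tie my @lines, 'Tie::File', $file, memory => 20_000_000
    or die "Cannot tie $file: $!";

# Buffer writes in memory instead of rewriting the file on every change.
(tied @lines)->defer;

# Example edit: strip trailing whitespace from every record.
s/\s+$// for @lines;

# Write the buffered changes to disk and release the file.
(tied @lines)->flush;
untie @lines;
```

Each element of @lines is one record of the file with the newline already removed, so you can edit the file as if it were an ordinary array without slurping the whole thing.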

That discussion aside, my main point (which I think you might have missed) was that I don't think you have to parse the data four times. I could be wrong, as I don't know all the facts, but can't some of this be done in tandem? I.e., why can't the format fix be done at the same time as the patch?

You would be patching and formatting lines (not arrays) of data on the fly. You only need a single pass over all that data, instead of several. You could probably even do the dup checking at the same time. Just build a hash of "things seen" as you're patching/formatting, and skip any dups that already appear in the hash. Pseudo-code for what I'm talking about:

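my %seen;
while ( my $line = <INFILEHANDLE> ) {
    chomp($line);
    next if $seen{$line}++;    # dup checking: skip lines already seen, then mark this one as seen
    patch_line($line);         # assumed to modify $line in place
    format_line($line);        # assumed to modify $line in place
    # ... any other code ...
    print OUTFILEHANDLE "$line\n";    # chomp removed the newline, so put it back
}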
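If you want something you can actually run, here is a minimal sketch of the same single-pass idea. The bodies of patch_line and format_line are placeholders I invented (I don't know what your real patch and format steps do), and the in-memory filehandles are only there so it runs without any files. The one detail that matters is checking the hash before marking the line as seen; otherwise every line looks like a dup.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real patch and format steps.
sub patch_line  { my ($line) = @_; $line =~ s/foo/bar/g; return $line }
sub format_line { my ($line) = @_; $line =~ s/\s+/ /g;   return $line }

# In-memory filehandles keep the sketch self-contained (sample data is made up).
my $input  = "foo  one\nfoo  one\ntwo\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open input: $!";
open my $out, '>', \$output or die "Cannot open output: $!";

my %seen;
while ( my $line = <$in> ) {
    chomp $line;
    next if $seen{$line}++;    # already seen: skip; otherwise mark as seen
    $line = format_line( patch_line($line) );
    print {$out} "$line\n";
}
close $in;
close $out;

print $output;
```

Here the dup check runs on the raw line, before patching. If two different raw lines can end up identical after patching and formatting, move the %seen check to just before the print instead.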

In reply to Re^3: Perl Filehandle? by wojtyk
in thread Perl Filehandle? by Smersh2000
