Have you tried Tie::File yet?
It also does caching and deferred writes and other types of optimization.
More importantly, you can give an upper limit on the amount of memory you want Tie::File to consume, which could possibly prevent excessive swapping.
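For what it's worth, here is a minimal sketch of what that could look like. The temp file and its three sample lines are made up just so the snippet runs on its own, and the 20 MB figure is an arbitrary example; `memory`, `defer` and `flush` are the documented Tie::File option and methods. Note that the limit covers the read cache and the deferred-write buffer, not every internal structure, so treat it as approximate.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Hypothetical sample data so the sketch is self-contained.
my ($fh, $filename) = tempfile();
print {$fh} "alpha\nbeta\ngamma\n";
close $fh or die "Cannot close $filename: $!";

# memory => N is an upper limit (in bytes) on Tie::File's read cache
# and deferred-write buffer. 20_000_000 is just an example value.
my @lines;
my $tie = tie @lines, 'Tie::File', $filename, memory => 20_000_000
    or die "Cannot tie $filename: $!";

$tie->defer;            # hold writes in memory instead of hitting the disk each time
$lines[1] = 'BETA';     # modifying the array modifies the file
$tie->flush;            # write out the deferred changes

undef $tie;             # drop the object reference before untying
untie @lines;
```

Each element of `@lines` is one record of the file, so you never have to slurp the whole thing into memory yourself.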
However, that discussion aside, my main point (which I think you might have missed) was that I don't think you have to parse it 4 times. I could be wrong (as I don't know all the facts), but can't any of this be done in tandem? I.e., why can't the format fix be done at the same time as the patch?
You would be patching and formatting lines (not arrays) of data on the fly. You only need a single iteration over all that data, instead of several. You could probably even do the dup checking at the same time. Just build a hash of "things seen" as you're patching/formatting, and skip any dups that already appear in the hash. Pseudo-code for what I'm talking about:
    while ( my $line = <INFILEHANDLE> ) {
        chomp($line);
        # dup checking: test first, then mark as seen, so only repeats are skipped
        next if $seen{$line}++;
        patch_line($line);     # assumed to modify $line in place
        format_line($line);    # likewise
        # ... any other code ...
        print OUTFILEHANDLE "$line\n";    # put back the newline that chomp removed
    }
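If you want something you can actually run, here is a rough single-pass sketch. The bodies of `patch_line` and `format_line` are pure guesses on my part (I don't know what your real patch and format steps do), `process_lines` is a name I made up, and the sample input is invented. One deliberate difference from the pseudo-code: this version checks for dups after patching and formatting, so lines that only become identical once they are cleaned up get caught too. Whether that is what you want depends on your data.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the real patch and format steps.
sub patch_line {
    my ($line) = @_;
    $line =~ s/\bfoo\b/bar/g;       # example "patch": rename foo to bar
    return $line;
}

sub format_line {
    my ($line) = @_;
    $line =~ s/^\s+|\s+$//g;        # trim leading/trailing whitespace
    $line =~ s/\s+/ /g;             # collapse runs of whitespace
    return $line;
}

# One pass over the input: patch, format, de-dup, and write.
sub process_lines {
    my ($in, $out) = @_;
    my %seen;
    while ( my $line = <$in> ) {
        chomp $line;
        $line = format_line( patch_line($line) );
        next if $seen{$line}++;     # skip lines already written
        print {$out} "$line\n";
    }
    return;
}

# Usage with in-memory filehandles; swap in real files as needed.
my $input  = "foo  one\nbar one\ntwo\n";
my $output = '';
open my $in,  '<', \$input  or die "Cannot open input: $!";
open my $out, '>', \$output or die "Cannot open output: $!";
process_lines($in, $out);
close $in;
close $out;
print $output;
```

Keep in mind that `%seen` holds one key per unique line, so on a really huge file with few dups the hash itself can get big.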