I have always heard that between these two methods of maintaining a data file the second was faster:

1) Read the file line by line (writing each line to a temp file) looking for the block that needs to be updated, change it, write the remaining lines to the temp file, then overwrite the original file with the temp file.

2) Read the entire contents of the file into an array, step through the array looking for the block that needs to be updated, change it, then write back to the file.

After a recent discussion on a message board I frequent, I decided to test it for myself. I came up with the following code, which consistently proves the first method faster. I tested it using text files that ranged in size from 30MB to 90MB. Is the first method really faster, or is there something in the way I'm implementing the second method that slows it down (perhaps using push to add the data to the second array)?

#!/usr/bin/perl -w
use strict;

my $stime = time();
my $filename = "file";
my $tempfile = "temp";
my $line;

open(OLD, "< $filename") or die "can't open $filename: $!";
open(NEW, "> $tempfile") or die "can't open $tempfile: $!";
while ($line = <OLD>) {
    # a code block to evaluate the current line
    # and possibly update it goes here
    print NEW $line or die "can't write $tempfile: $!";
}
close(OLD) or die "can't close $filename: $!";
close(NEW) or die "can't close $tempfile: $!";
rename($filename, "$filename.bak") or die "can't rename $filename: $!";
rename($tempfile, $filename) or die "can't rename $tempfile: $!";
my $ftime = time();
my $etime = $ftime - $stime;
print "$etime\n";

$stime = time();
open(DATA, "$filename") or die "can't open $filename: $!";
my @data = <DATA>;
my @vads;
close(DATA) or die "can't close $filename: $!";
foreach $line (@data) {
    # a code block to evaluate the current line
    # and possibly update it goes here
    push(@vads, $line);
}
open(DATA, ">$filename") or die "can't open $filename: $!";
foreach $line (@vads) {
    print DATA $line;
}
close(DATA) or die "can't close $filename: $!";
$ftime = time();
$etime = $ftime - $stime;
print "$etime\n";

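One caveat about the timing above: the built-in time() returns whole seconds, so short runs are hard to compare. The following is only a rough sketch of how the two methods might be timed more finely, assuming Time::HiRes and File::Temp are installed (both ship with recent Perls). The helper names update_streaming and update_slurp and the generated test data are hypothetical, not part of the original script. The second helper edits @data in place rather than pushing onto a second array, which is one way to check whether the push is what costs the time.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Time::HiRes qw(time);    # replaces time() with a floating-point version

# Method 1 (hypothetical helper): stream line by line through a temp file,
# then rename the temp file over the original.
sub update_streaming {
    my ($file, $edit) = @_;
    my $tmp = "$file.tmp";
    open(my $in,  '<', $file) or die "can't open $file: $!";
    open(my $out, '>', $tmp)  or die "can't open $tmp: $!";
    while (my $line = <$in>) {
        print {$out} $edit->($line) or die "can't write $tmp: $!";
    }
    close($in)  or die "can't close $file: $!";
    close($out) or die "can't close $tmp: $!";
    rename($tmp, $file) or die "can't rename $tmp: $!";
}

# Method 2 (hypothetical helper): slurp the file into an array, edit the
# elements in place, and write the array back in a single print.
sub update_slurp {
    my ($file, $edit) = @_;
    open(my $in, '<', $file) or die "can't open $file: $!";
    my @data = <$in>;
    close($in) or die "can't close $file: $!";
    $_ = $edit->($_) for @data;    # no second array, no push
    open(my $out, '>', $file) or die "can't open $file: $!";
    print {$out} @data or die "can't write $file: $!";
    close($out) or die "can't close $file: $!";
}

# Throwaway test file with made-up records (far smaller than 30MB).
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "record $_\n" for 1 .. 100_000;
close($fh) or die "can't close $file: $!";

# Stand-in for "the block that needs to be updated".
my $edit = sub { my ($line) = @_; $line =~ s/^record 500\n/RECORD 500\n/; $line };

for my $method ([streaming => \&update_streaming], [slurp => \&update_slurp]) {
    my ($name, $code) = @$method;
    my $start = time();
    $code->($file, $edit);
    printf "%-10s %.3f s\n", $name, time() - $start;
}
```

A single run like this is still noisy (disk caching favours whichever method runs second), so repeating each method several times, or using the core Benchmark module, would give a fairer comparison.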
John

In reply to Speed differences in updating a data file by Cyrnus
