in reply to Re^3: Removing the first record in a file containing fixed records
in thread Removing the first record in a file containing fixed records

Contrast Tie::File taking 233 seconds:

use strict;
use warnings;
use Tie::File;

$/ = \96;
my $start = time;
tie my @records, 'Tie::File', $ARGV[ 0 ];
shift @records;
untie @records;
printf "Time: %f seconds\n", time() - $start;

__END__
2008-07-18  14:17       527,999,712 500MB.fixed
               1 File(s)    527,999,712 bytes
               0 Dir(s)   2,320,445,440 bytes free

C:\test>junk7 500MB.fixed
Time: 233.000000 seconds

C:\test>dir 500MB.fixed
 Volume in drive C has no label.
 Volume Serial Number is BCCA-B4CC

 Directory of C:\test

2008-07-18  23:34       527,999,616 500MB.fixed
               1 File(s)    527,999,616 bytes
               0 Dir(s)   2,320,416,768 bytes free

With a read-seek-write solution taking < 5 seconds:

#! perl -slw
use strict;
use Fcntl qw[ SEEK_CUR SEEK_SET ];

use constant BUFSIZE => 64 * 1024;

my $start = time;

## The -s switch on the shebang line sets $RECLEN from -RECLEN=nnn
our $RECLEN || die "you must specify the length of the header. -RECLEN=nnn";
@ARGV or die "No filename";

open FILE, '+<:raw', $ARGV[ 0 ] or die "$!: $ARGV[ 0 ]";

## Read (and discard) the first record
sysread FILE, my $header, $RECLEN or die "sysread: $!";

my( $nextWrite, $nextRead ) = 0;

## Copy each chunk down over the space freed by the removed record
while( sysread FILE, my $buffer, BUFSIZE ) {
    $nextRead = sysseek FILE, 0, SEEK_CUR
        or die "Seek query next read failed; $!";
    sysseek FILE, $nextWrite, SEEK_SET
        or die "Seek next write failed: $!";
    syswrite FILE, $buffer
        or die "Write failed: $!";
    $nextWrite = sysseek FILE, 0, SEEK_CUR
        or die "Seek query next write failed $!";
    sysseek FILE, $nextRead, SEEK_SET
        or die "Seek next Read failed: $!";
}

## Chop off the now-duplicated tail
truncate FILE, $nextWrite or die "truncate failed: $!";
close FILE or die "close failed: $!";

printf "Took: %f seconds\n", time() - $start;

__END__
C:\test>dir 500MB.fixed
 Volume in drive C has no label.
 Volume Serial Number is BCCA-B4CC

 Directory of C:\test

2008-07-18  23:34       527,999,616 500MB.fixed
               1 File(s)    527,999,616 bytes
               0 Dir(s)   2,320,416,768 bytes free

C:\test>698472 -RECLEN=96 500MB.fixed
Took: 5.000000 seconds

C:\test>dir 500MB.fixed
 Volume in drive C has no label.
 Volume Serial Number is BCCA-B4CC

 Directory of C:\test

2008-07-18  23:37       527,999,520 500MB.fixed
               1 File(s)    527,999,520 bytes
               0 Dir(s)   2,320,445,440 bytes free

I'll grant you, the Tie::File version does have the virtue of simplicity.
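
Both timings above come from the built-in time(), which only resolves whole seconds. As a minimal sketch, and not part of the original post, the same read-seek-write idea can be wrapped in a sub and timed with the core Time::HiRes module. The sub name remove_first_record, its arguments, and the choice to track the read and write offsets arithmetically (rather than querying them with sysseek) are assumptions made for illustration.

```perl
use strict;
use warnings;
use Fcntl qw( SEEK_SET );
use Time::HiRes qw( time );    # overrides time() with a sub-second version

# Hypothetical helper: remove the first $reclen bytes of $path in place by
# copying the remainder of the file down in $bufsize chunks, then truncating.
# Returns the new file size in bytes.
sub remove_first_record {
    my ( $path, $reclen, $bufsize ) = @_;
    $bufsize ||= 64 * 1024;

    ( -s $path ) >= $reclen or die "File is shorter than one record: $path";
    open my $fh, '+<:raw', $path or die "$!: $path";

    # Reading starts just past the first record; writing starts at offset 0.
    my ( $readPos, $writePos ) = ( $reclen, 0 );
    while( 1 ) {
        sysseek $fh, $readPos, SEEK_SET or die "Seek for read failed: $!";
        my $got = sysread $fh, my $buffer, $bufsize;
        defined $got or die "sysread: $!";
        last unless $got;                  # 0 bytes read means end of file

        sysseek $fh, $writePos, SEEK_SET or die "Seek for write failed: $!";
        my $wrote = syswrite $fh, $buffer;
        defined $wrote && $wrote == $got or die "Write failed: $!";

        $readPos  += $got;
        $writePos += $got;
    }

    truncate $fh, $writePos or die "truncate failed: $!";
    close $fh or die "close failed: $!";
    return $writePos;
}

# Example use (file name and record length taken from the runs above):
#   my $start = time;
#   remove_first_record( '500MB.fixed', 96 );
#   printf "Took: %.3f seconds\n", time() - $start;
```

The write offset always trails the read offset by one record length, and each chunk is fully read before it is written, so the copy never overwrites data it has yet to read.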


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

Re^5: Removing the first record in a file containing fixed records
by Narveson (Chaplain) on Jul 21, 2008 at 05:59 UTC

    Thank you for sharing these timings and for writing the faster code. Both are instructive.

    Would the operation run even faster if coded in C?

      Would the operation run even faster if coded in C?

      Marginally. Maybe. A straightforward conversion of the above code to C ran in 4 seconds one time and 2 the next, probably because the file was still in the system cache. But the timings were only measured to the nearest second.

