in reply to Re^6: Sort big text file - byte offset - 50% there (Added code)
in thread Sort big text file - byte offset - 50% there
I think the problem is in your indexing loop (and goes right back to your OP).
    my @index;
    while (<BIGLOG>){
        my $offset = tell BIGLOG; ### This offset is the start of the *next* line!
        my $epoch = ( /^\s*#/ or /^\s\n/ or $_ !~ /^\s*\d/ )
            ? 0
            : Mktime( unpack 'A4xA2xA2xA2xA2xA2', $_ );
        push @index, pack 'NN', $epoch, $offset;
    }
You read a line, record the file position, and then pair that file position with the epoch info from the line you read. But that offset is the start of the next line, not the one you just read. The result is that every offset is displaced by one line: the first line's offset (zero) is never recorded, and the last entry points at end of file. So when you seek to that last offset and try to read, there is nothing left to read, and it fails.
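To see the displacement concretely, here is a minimal sketch that runs both approaches over a tiny in-memory file. The three-line sample data and the variable names are made up for illustration; they stand in for your real log and are not taken from your code.

```perl
use strict;
use warnings;

# Hypothetical three-line sample standing in for the real log.
my $data = "aaa\nbb\nc\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my ( @after, @before );
my $start = 0;                 # the first line starts at byte 0
while ( my $line = <$fh> ) {
    push @after,  tell $fh;    # taken after the read: start of the NEXT line
    push @before, $start;      # start of the line that was just read
    $start = tell $fh;         # remember where the next line begins
}
close $fh;

print "tell after read: @after\n";
print "true line starts: @before\n";
```

The last value in the first list equals the length of the data, that is, end of file, which is why a read after seeking there comes back empty.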
You need to recast that loop something like this:
    my( $offset, @index ) = 0;    ## The first line's offset is zero
    while (<BIGLOG>){
        my $epoch = ( /^\s*#/ or /^\s\n/ or $_ !~ /^\s*\d/ )
            ? 0
            : Mktime( unpack 'A4xA2xA2xA2xA2xA2', $_ );
        push @index, pack 'NN', $epoch, $offset;    ## Pair with previous offset
        $offset = tell BIGLOG;    ## and now get the start of the next line
    }
You probably should be checking the return code from seek also.
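As a rough sketch of what checking seek might look like, the following builds an offset-only index over an in-memory file and reads the lines back in reverse order. The sample data, the lexical filehandle, and the single-field 'N' record are assumptions made to keep the example self-contained; your real index also carries the epoch.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real BIGLOG would be opened from disk.
my $data = "first\nsecond\nthird\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

# Build the index the corrected way: the offset is recorded before each read.
my ( $offset, @index ) = 0;
while ( <$fh> ) {
    push @index, pack 'N', $offset;
    $offset = tell $fh;
}

# Read the lines back in reverse order, checking every seek and read.
my @lines;
for my $rec ( reverse @index ) {
    my ($pos) = unpack 'N', $rec;
    seek( $fh, $pos, 0 )           # whence 0 = absolute position (SEEK_SET)
        or die "seek to $pos failed: $!";
    my $line = <$fh>;
    defined $line or die "nothing to read at offset $pos";
    chomp $line;
    push @lines, $line;
}
close $fh;

print "@lines\n";
```

Checking the read as well as the seek means a bad offset is reported where it happens, rather than turning up later as an undefined value.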