The facts I've found so far:
- every line is 80 characters exactly
- the lines need to be grouped by the field at offsets 2..10
- the groups need to be sorted by the field at offsets 11..14
- this second field is a coded time that needs to be decoded
A possible strategy would be to first 'index' the file by reading it line by line:
(untested code follows)
open FILE, '<', $filename or die "Can't open $filename: $!";
my %index;
my $line = 0;
while (<FILE>) {
    # offsets 2..10 hold the location (9 chars), offsets 11..14 the coded time (4 chars)
    my ($location, $time) = /^..(.{9})(.{4})/;
    push @{ $index{$location} }, [ $time, $line ];
    $line++;
}
This would result in a hash keyed on the 'location', with each value being a reference to an array of [time, line number] pairs containing the info you need to sort the lines. This seems to be the minimum amount of info needed to determine the sort order.
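For concreteness, here is a small self-contained run of that indexing loop over a few in-memory records (the location names, times, and padding are made up; only the field layout follows the facts above):

```perl
use strict;
use warnings;

# Three made-up 80-char records: 2 chars of prefix, 9-char location,
# 4-char coded time, then padding out to 80 characters.
my @lines = (
    sprintf("XX%-9s%-4s%s", 'LOCATION1', '0002', 'x' x 65),
    sprintf("XX%-9s%-4s%s", 'LOCATION2', '0001', 'x' x 65),
    sprintf("XX%-9s%-4s%s", 'LOCATION1', '0001', 'x' x 65),
);

my %index;
my $line = 0;
for (@lines) {
    my ($location, $time) = /^..(.{9})(.{4})/;
    push @{ $index{$location} }, [ $time, $line ];
    $line++;
}

# %index now holds:
#   LOCATION1 => [ ['0002', 0], ['0001', 2] ]
#   LOCATION2 => [ ['0001', 1] ]
```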
The next step is to sort the arrays by the time values you've stored, and fetch the lines from the file in sorted order:
(untested code again)
for my $location (keys %index) {
    my @sorted = sort { $a->[0] <=> $b->[0] }   # decode the times here if the raw coding doesn't sort numerically
                 @{ $index{$location} };
    for my $entry (@sorted) {
        seek FILE, 81 * $entry->[1], 0;   # 80 chars + 1 newline per record
        my $buf;
        read FILE, $buf, 80;
        print $buf, "\n";
    }
}
This method should be very memory efficient, I think, and not too slow either; the biggest slowdown is probably the seeking around in the file.
This method works because we know the length of the records. If we didn't, we could use the tell function before reading each line to store the line's exact start position in the index...
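A sketch of that tell-based variant, end to end (the sample file, its name, and its contents are made up here just to have something to index; only the field layout follows the facts above):

```perl
use strict;
use warnings;

# Build a throwaway sample file of 80-char records, one per line.
my $file = 'sorttest.tmp';
open my $out, '>', $file or die "Can't write $file: $!";
for my $rec (['LOCATION1', '0002'], ['LOCATION2', '0001'], ['LOCATION1', '0001']) {
    printf $out "XX%-9s%-4s%s\n", $rec->[0], $rec->[1], 'x' x 65;  # 80 chars + newline
}
close $out;

# Index: store each line's byte offset (from tell) instead of relying on
# a fixed record length.
open my $fh, '<', $file or die "Can't read $file: $!";
my %index;
while (1) {
    my $offset = tell $fh;           # byte offset of the line we're about to read
    my $line   = <$fh>;
    last unless defined $line;
    my ($location, $time) = $line =~ /^..(.{9})(.{4})/;
    push @{ $index{$location} }, [ $time, $offset ];
}

# Fetch the lines back in sorted order by seeking to the stored offsets.
my @output;
for my $location (sort keys %index) {
    for my $entry (sort { $a->[0] <=> $b->[0] } @{ $index{$location} }) {
        seek $fh, $entry->[1], 0;    # jump straight to the stored offset
        push @output, scalar <$fh>;  # the whole line, newline included
    }
}
print @output;
close $fh;
unlink $file;
```

Since seek clears the filehandle's EOF flag, reusing the same handle for the fetch pass works fine.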