It was pointed out that whilst using 'd' to pack your offsets will work, when it comes to doing the sorting you would have to unpack the values and do a numeric comparison, because packed floats/doubles do not sort correctly in their pack'd binary form.
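To illustrate the problem, here is a minimal sketch that string-sorts a handful of doubles in packed form. It assumes Perl 5.10 or later for the '<' modifier and an IEEE 754 double; 'd<' is used instead of the native 'd' only to pin down the little-endian byte layout you would get on x86, so the result does not depend on the machine it runs on. The sample values are made up for the demonstration.

```perl
use strict;
use warnings;

# 'd<' forces the little-endian layout that native 'd' produces on x86
# (an assumption made here so the demo behaves the same everywhere).
my @offsets = ( 1, 2, 3, 256, 65536 );
my @packed  = map { pack 'd<', $_ } @offsets;

# Plain string sort of the packed bytes, then unpack to see the order.
my @byString = map { unpack 'd<', $_ } sort @packed;

# Numeric sort of the original values for comparison.
my @byNumber = sort { $a <=> $b } @offsets;

print "string sort : @byString\n";
print "numeric sort: @byNumber\n";
```

The string sort compares the low-order mantissa bytes first, so the two orders disagree even for small positive integers.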
Having previously extolled the virtues of sorting numeric data in its binary form, that doesn't sit right with me, so here are a couple of routines that will pack and unpack a non-negative, integer-valued FP value < 2**53 to and from an 8-byte binary form that is sortable, along with some rudimentary tests:
#! perl -slw
use strict;
use Math::Random::MT qw[ rand ];

## Pack an integer-valued float < 2**53 as two big-endian 32-bit words (hi, lo)
sub ftob64 {
    return pack 'NN', int( $_[ 0 ] / 2**32 ), int( $_[ 0 ] % 2**32 );
}

## Reverse of ftob64: recombine the two 32-bit words into a float
sub b64tof {
    my( $hi, $lo ) = unpack 'NN', $_[ 0 ];
    return $hi * 2**32 + $lo;
}

## Round-trip test: report any value that does not survive pack/unpack
for ( 1 .. 1e6 ) {
    my $test  = int( rand 2**53 );
    my $b64   = ftob64 $test;
    my $float = b64tof $b64;
    if( abs( $test - $float ) > 1e-15 ) {
        printf "%31.f v %31.f => diff %31.31f\n",
            $test, $float, abs( $test - $float );
    }
}

## Sort test: string-sort the packed values, then verify the numeric order
my @randomBin = map{ ftob64 int rand 2**53 } 1 .. 1e6;
my @sortedBin = sort @randomBin;
my @sortedN   = map{ b64tof $_ } @sortedBin;

$sortedN[ $_ ] > $sortedN[ $_ + 1 ]
    and die "Error: $_ : $sortedN[ $_ ] > $sortedN[ $_ + 1 ]"
    for 0 .. $#sortedN - 1;
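As a rough illustration of why this layout sorts correctly, the sketch below dumps the packed form of a few hand-picked values in hex. The sample values and the $ordered flag are illustrative only; ftob64 is copied from the code above. Because 'N' stores each 32-bit word most significant byte first, and the high word comes before the low word, a byte-by-byte string comparison agrees with the numeric comparison.

```perl
use strict;
use warnings;

# Same packing routine as above: high and low 32-bit halves, big-endian.
sub ftob64 {
    return pack 'NN', int( $_[ 0 ] / 2**32 ), int( $_[ 0 ] % 2**32 );
}

# Illustrative values, listed in ascending numeric order.
my @values = ( 1, 255, 256, 2**32 - 1, 2**32, 2**53 - 1 );

# Show each value next to the hex dump of its packed form.
printf "%16.0f => %s\n", $_, unpack( 'H*', ftob64( $_ ) ) for @values;

# Check that the packed strings are ascending under plain string comparison.
my @packed  = map { ftob64( $_ ) } @values;
my $ordered = 1;
$packed[ $_ ] lt $packed[ $_ + 1 ] or $ordered = 0 for 0 .. $#packed - 1;

print $ordered ? "string order matches numeric order\n" : "order mismatch\n";
```

The hex dumps make the ordering visible: each larger value differs from the previous one at an earlier or higher-valued byte.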
In reply to Re^10: Sort big text file - byte offset
by BrowserUk
in thread Sort big text file - byte offset - 50% there
by msalerno