It was pointed out that whilst using 'd' to pack your offsets will work, when it comes to sorting you would have to unpack the values and do a numeric comparison, because sorting numeric data in its packed binary form doesn't work for packed floats/doubles.
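To see the problem concretely, here is a minimal sketch. It assumes IEEE 754 doubles and uses 'd<' (little-endian, available from perl 5.10) so the result does not depend on the machine it runs on; on x86, native 'd' has the same byte layout. The offsets are made-up illustrative values, not taken from the thread.

```perl
use strict;
use warnings;

# Illustrative offsets whose numeric order is obvious.
my @offsets = ( 1, 2, 256, 65536, 3 );

# Pack each offset as a little-endian double (assumption: IEEE 754 layout).
my @packed = map { pack 'd<', $_ } @offsets;

# Sort the packed strings bytewise, then unpack to see the resulting order.
my @by_bytes = map { unpack 'd<', $_ } sort @packed;

# Sort the same values numerically for comparison.
my @by_value = sort { $a <=> $b } @offsets;

print "string sort of packed doubles: @by_bytes\n";
print "numeric sort:                  @by_value\n";
```

The string sort comes out as 2 3 256 1 65536, because the least significant mantissa bytes are compared first and the exponent last.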

Having previously extolled the virtues of sorting numeric data in its binary form, that doesn't sit right with me, so here are a couple of routines that pack and unpack a non-negative, integer-valued FP value < 2**53 to and from an 8-byte binary form that sorts correctly as a plain string, along with some rudimentary tests:

#! perl -slw
use strict;
use Math::Random::MT qw[ rand ];

sub ftob64 {
    return pack 'NN', int( $_[ 0 ] / 2**32 ), int( $_[ 0 ] % 2**32 );
}

sub b64tof {
    my( $hi, $lo ) = unpack 'NN', $_[ 0 ];
    return $hi * 2**32 + $lo;
}

for ( 1 .. 1e6 ) {
    my $test = int( rand 2**53 );
    my $b64 = ftob64 $test;
    my $float = b64tof $b64;
    if( abs( $test - $float ) > 1e-15 ) {
        printf "%31.f v %31.f => diff %31.31f\n",
            $test, $float, abs( $test - $float );
    }
}

my @randomBin = map{ ftob64 int rand 2**53 } 1 .. 1e6;
my @sortedBin = sort @randomBin;
my @sortedN = map{ b64tof $_ } @sortedBin;

$sortedN[ $_ ] > $sortedN[ $_ + 1 ]
    and die "Error: $_ : $sortedN[ $_ ] > $sortedN[ $_ + 1 ]"
    for 0 .. $#sortedN - 1;
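For a closer look at why the packed keys sort correctly, the following sketch dumps a handful of keys as hex. It repeats ftob64 from above so that it runs on its own; the offsets are hypothetical values chosen to straddle the 2**32 boundary. Because 'N' is big-endian, the most significant byte comes first, so a bytewise string comparison agrees with the numeric one.

```perl
use strict;
use warnings;

# Same packing as ftob64 above: high 32 bits, then low 32 bits, both big-endian.
sub ftob64 {
    return pack 'NN', int( $_[0] / 2**32 ), int( $_[0] % 2**32 );
}

# Hypothetical byte offsets, some beyond the 4 GB (2**32) mark.
my @offsets = ( 5_000_000_000, 255, 4_294_967_296, 256, 1 );

# Pack, then use a plain string sort; no unpacking is needed to compare.
my @packed = map { ftob64( $_ ) } @offsets;
my @sorted = sort @packed;

# Decode each key again and print it next to its hex representation.
my @decoded;
for my $key ( @sorted ) {
    my( $hi, $lo ) = unpack 'NN', $key;
    push @decoded, $hi * 2**32 + $lo;
    printf "%s  %.0f\n", unpack( 'H*', $key ), $decoded[-1];
}
```

Reading the hex column top to bottom, the keys increase exactly as the decoded offsets do.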
