in reply to Re: Re: Re: Re: (Guildenstern) Re: Re: Taming a memory hog
in thread Taming a memory hog

You might like to try this. It sorts an 80 MB file in around 5 minutes and consumes less than 10 MB of RAM.

It will also handle sorting files up to the 4 GB filesystem limit using less than 50 MB of RAM, though it will obviously run somewhat more slowly. Given more information about the form of the records, it would be possible to tailor the algorithm to speed up the processing.

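#! perl -slw
use strict;

open IN, '+<', $ARGV[0] or die $!;

# Pass 1: bucket every line by its first two bytes. Each bucket is a string
# of 8-byte records: a 4-byte file offset plus bytes 3..6 of the line.
my @splits;
my $pos = 0;
while( <IN> ) {
    $splits[ unpack 'n', substr( $_, 0, 2) ] .= pack( 'V', $pos ) . substr $_, 2, 4;
    $pos = tell IN;
}

# Drop the empty buckets; the rest are already in order of their 2-byte prefix.
@splits = grep $_, @splits;

my $n;
for my $split ( @splits ) {
    # Sort one bucket on the cached 4 bytes, re-reading the full lines from
    # disk only to break ties, then keep just the offsets.
    $split = join'', map{
        substr $_, 0, 4
    } sort{
        my( $as, $at, $bs, $bt ) = ( unpack( 'VA4', $a ), unpack( 'VA4', $b ) );
        $at cmp $bt
        || do{ seek IN, $as, 0; scalar <IN> } cmp do{ seek IN, $bs, 0; scalar <IN> }
    } unpack '(A8)*', $split;
}

# Pass 2: emit the lines in sorted order by seeking to each offset.
# printf rather than print, so that -l does not append a second newline;
# the explicit '%s' keeps any % in the data from being read as a format.
for my $split ( @splits ) {
    printf '%s', do{ seek IN, $_, 0; scalar <IN> } for unpack 'V*', $split;
}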
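
For anyone skimming, the core trick is that only file offsets are held in memory and sorted, never the lines themselves. Here is a stripped-down sketch of just that idea. It is not the script above: it leaves out the bucketing on the first two bytes and the cached 4-byte key, so it seeks on every comparison and will be far slower. The name sorted_by_offset and the in-memory test data are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return the lines of a seekable filehandle in sorted
# order, keeping only one byte offset per line in memory.
sub sorted_by_offset {
    my ($fh) = @_;

    # Record where each line starts.
    my @offsets;
    seek $fh, 0, 0;
    my $pos = 0;
    while ( defined( my $line = <$fh> ) ) {
        push @offsets, $pos;    # start of the line just read
        $pos = tell $fh;        # start of the next line
    }

    # Fetch a line on demand, given its offset.
    my $line_at = sub {
        seek $fh, $_[0], 0;
        return scalar <$fh>;
    };

    # Sort the offsets by the lines they point to, then read the lines back.
    return map { $line_at->($_) }
           sort { $line_at->($a) cmp $line_at->($b) } @offsets;
}

# Demo on an in-memory "file" (assumed data, not from the original post).
my $data = "pear\napple\nfig\n";
open my $fh, '<', \$data or die $!;
print sorted_by_offset($fh);
```

The script above gets its speed by avoiding most of those seeks: the 2-byte prefix picks the bucket, the cached 4 bytes settle most comparisons, and the disk is only touched when two keys tie.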

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"Think for yourself!" - Abigail