in reply to Re: Sorting a (very) large file
in thread Sorting a (very) large file

Or you could try something twice as fast that uses much less memory. In this case I'd likely sort parallel arrays (though it isn't too complicated to do something even faster that uses even less memory, such as a "fast, flexible, stable sort").

    my @size= map { ( split /\t/, $_ )[2] } @in;
    my @idx= sort { $size[$a] <=> $size[$b] } 0..$#in;
    @in= @in[@idx];
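For reference, one well-known way to speed up the sort step itself is the packed-key (GRT) sort: glue a fixed-width binary key onto the front of each line, let sort compare plain strings with no callback at all, then strip the key back off. A sketch, assuming the sizes fit in an unsigned 32-bit integer (note it does build a keyed copy of each line):

    # Prepend each line's size as a 4-byte big-endian key,
    # sort lexically (no comparison sub needed), then drop the key.
    @in= map { substr $_, 4 }
         sort
         map { pack( 'N', ( split /\t/, $_ )[2] ) . $_ }
         @in;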

And I'd check how much paging space is configured. It sounds like that could be increased.

- tye        

Re^3: Sorting a (very) large file (better*2)
by samtregar (Abbot) on Nov 30, 2007 at 20:13 UTC
    I really doubt that's going to work in 1GB of RAM on a 400MB input (and don't worry about paging space - if you start paging then any speedup you gained just went bye-bye). I guess it's possible, but this line looks pretty bad to me:

       @in = @in[@idx];

    Or do you happen to know that Perl does that in-place on @in? I guess the only way to really know would be to try it, but unfortunately I do have some real work to do...

    -sam

      I believe there is this thing called "virtual memory"... Iterating over a list once doesn't have much "thrash" potential, unlike sorting.
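      For what it's worth, a sketch of doing that single pass straight to disk, so @in never has to be reordered at all (the output filename here is just a placeholder):

         # Write the lines out in sorted order instead of
         # building a reordered copy of @in in memory.
         open my $out, '>', 'sorted.txt' or die "open: $!";
         print {$out} $in[$_] for @idx;
         close $out or die "close: $!";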

      Update: Also note that the tons of anonymous arrays created by the ST (Schwartzian Transform) cost a lot more memory than most people might think. I suspect that would be about as much as, or even more than, the memory required by the copies of each line.
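      For illustration, a generic ST for this task (not necessarily what was posted upthread); every input line gets its own two-element anonymous array, which is where the hidden memory cost comes from:

         # Schwartzian Transform: one [ $line, $size ] anonymous
         # array is allocated per input line.
         @in= map  { $_->[0] }
              sort { $a->[1] <=> $b->[1] }
              map  { [ $_, ( split /\t/, $_ )[2] ] }
              @in;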

      - tye        

        God damn it, you're going to make me actually try it, aren't you?

        UPDATE: What are you talking about? My solution wasn't an ST!

        -sam