http://qs1969.pair.com?node_id=385106


in reply to •Re^2: Benchmark, -s versus schwartzian
in thread Benchmark, -s versus schwartzian

I'd say that in the general case, GRT will be the fastest solution. If the list is large enough, the bottleneck will be in the sort function, and since the GRT has the simplest sort function possible (namely, the default one), it will outperform anything else.

Replies are listed 'Best First'.
•Re^4: Benchmark, -s versus schwartzian
by merlyn (Sage) on Aug 23, 2004 at 15:48 UTC
    By "general", I mean "mortal people can work out a solution for all complex sort criterion". You have to be a frickin' genius to work out a single string for the GRT sometimes. The ST in comparison is always straightforward: construct a list of your sortable keys for a given input item.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Why? You can always do the GRT by using only the key (by which you sort) and an index in the string. I think of something like:

      @files = glob("/bin/*"); @sorted = @files[ map { /(\d+)$/g } sort map { sprintf "%012d %d", -s $files[$_], $_ } 0..@files-1]; print "@sorted$/";

      This is of course efficent only because of Perl optimizing the default sort subroutine. Someone who understands perl source well might be able to change sort so that it would optimize sort methods like {$h[$a]<=>$h[$b]} which would give us a possibility to sort even faster like

      my @h = map -s, @files; @results = @files[sort {$h[$a]<=>$h[$b]} 0..@h-1];
        You can always do the GRT by using only the key...
        Uh, that's sometimes the hard part. Suppose you wanted to sort by a floating point number (say, age in days), then descending by a string of varying length that might have a NUL in it, and then descending by another integer (byte size). Quick, construct the GRT for that. Not easy, huh. Trivial in the ST.

        Yes, you can always get a single GRT string for a multilevel sort. But sometimes, as I said, it takes a frickin' genius.

        -- Randal L. Schwartz, Perl hacker
        Be sure to read my standard disclaimer if this is a reply.

        I rarely find the GRT worth the bother. It is ridiculously hard to maintain GRT code, and it's not unlikely for the transformation step in a GRT to be so computionally complex that the order-by-sorted-indices method as per your second snippet beats it anyway — as seen at Re: To use a module...or not.. Order-by-sorted-indices is generally straightforward, easy to maintain, and easy to get consistent results from; yet very fast and memory efficient. I doubt I'll ever need anything else (though it's not completely impossible, of course).

        I appreciate the ST as an excercise in functional thinking, but it's been a long time since I last used it in practice.

        Makeshifts last the longest.