in reply to Re: Better mousetrap (getting top N values from list X)
in thread Better mousetrap (getting top N values from list X)

See my pad for a benchmark.


Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

Replies are listed 'Best First'.
Re^3: Better mousetrap (getting top N values from list X)
by Aristotle (Chancellor) on Feb 02, 2005 at 00:30 UTC

    Why not post it in the thread? Here's an adapted bench with a version of my code modified slightly to be callable like the others:

    It is interesting to see that my code wins hands down when N/MAX is close to 1. Even when the ratio of N/MAX shrinks, my code loses a lot of ground but keeps beating Limbic's proposition. None of this matters much though since for large MAX, all of the solutions perform very similarly, even if the trends remain clear.

    So out of curiosity I added the following bit to the code:

    baseline => sub { my ( $n, $list ) = @_; return ( sort { $a <=> $b } @$list )[ @$list - $n .. $#$list ] +; },

    Well, I'll just say let's pack the bags and go home folks. Nothing to see here, move along. As I said: clever Perl code vs builtin: clever Perl code loses. Grossly disproportionately, in fact.

    (Spoiler for anyone who doesn't care to run the benchmarks: in all cases the baseline sort version runs hundreds to thousands of times faster than the other solutions.)

    Makeshifts last the longest.

      Your benchmark gives as input a list in sorted order. As you mentioned, mergesort is extremely good on sorted or nearly-sorted lists. You should shuffle the list first to avoid giving the builtin a slam-dunk.

      I tried it, and in a couple of cases limbic gets comparable times and even beats baseline:

      Looking for top 5 in 10000 (running for 9.5 CPU secs) Rate aristotle browseruk limbic baseline aristotle 13.7/s -- -46% -89% -92% browseruk 25.2/s 84% -- -81% -85% limbic 130/s 847% 414% -- -23% baseline 169/s 1134% 571% 30% -- Looking for top 5 in 100000 (running for 15 CPU secs) Rate aristotle browseruk baseline limbic aristotle 1.38/s -- -44% -85% -90% browseruk 2.48/s 79% -- -73% -82% baseline 9.14/s 560% 268% -- -35% limbic 14.0/s 913% 465% 53% --

        Sigh. One more of those too frequent infrequent moments where I feel collossally stupid. It's not like I didn't say right there in my first reply that the builtin performs particularly well in those cases…

        I'll take another look at the trends in the bechmarks with some shuffling added.

        Makeshifts last the longest.