tlm has asked for the wisdom of the Perl Monks concerning the following question:

Hola monks!

I wrote the following snippet to compare various basic sorts:

use strict; use warnings; use Benchmark 'cmpthese'; srand 0; my @n = map int(rand(2**7)), 1..1000; my @a = map chr($_), @n; cmpthese( -1, { default_n => sub { ( sort @n )[ 0 ] }, cmp_n => sub { ( sort { $a cmp $b } @n )[ 0 ] }, '<=>' => sub { ( sort { $a <=> $b } @n )[ 0 ] }, default_a => sub { ( sort @a )[ 0 ] }, cmp_a => sub { ( sort { $a cmp $b } @a )[ 0 ] }, } ); __END__ Rate cmp_n default_n cmp_a default_a <=> cmp_n 667/s -- -0% -3% -3% -40% default_n 667/s 0% -- -3% -3% -40% cmp_a 691/s 4% 4% -- 0% -38% default_a 691/s 4% 4% 0% -- -38% <=> 1110/s 66% 66% 61% 61% --

I'm having a hard time rationalizing why <=> did so much better than the default alphabetic sort, since it was my understanding that the latter should be better. I think the comparison above is fair; am I wrong?

BTW, I figure that the reason why default_n fares worse than default_a is that the former is comparing strings of length up to 3, while the latter is comparing strings of length 1.

the lowliest monk

Replies are listed 'Best First'.
Re: Yet Another Sort Benchmarking Question
by dave_the_m (Monsignor) on May 08, 2005 at 14:03 UTC
    I'm having a hard time rationalizing why <=> did so much better than the default alphabetic sort, since it was my understanding that the latter should be better
    I don't know what gave you that idea. Its a lot quicker at the hardware level to compare two integers than two strings.

    As an aside, note that perl specifically optimizes  {$a cmp $b} and {$a <=> $b}, and doesn't actually call the perl code for each pair comparison

    Dave.

      I don't know what gave you that idea.

      What gave me the idea is are statements (probably misconstrued by me) in Guttman and Rosler's paper on sort optimization asserting or implying that Perl's default lexicographic sort is faster than any sort that uses a user-specified comparison sub.

      As an aside, note that perl specifically optimizes {$a cmp $b} and {$a <=> $b}, and doesn't actually call the perl code for each pair comparison

      This answers my question. It suggests that some of the claims in the cited paper are no longer valid there are a few special cases for which the assertion in that paper that the default lexicographic sort is faster than any sort using a user-supplied sortsub no longer holds. Thanks.

      Update: I realized, after reading dave_the_m's and tilly's replies, that, in its original form, the second sentence of my last paragraph may have given the impression that the paper I cited is somehow out-of-date. That was not at all my intent; I've tried to edit the problematic sentence to avoid this false impression.

      the lowliest monk

        This answers my question. It suggests that some of the claims in the cited paper are no longer valid
        The basic message of the paper is still true though.

        Dave.

        That claim remains true for any sort operation that has not been special cased inside of perl. The number of exceptions to their claim can currently be counted on one hand.
Re: Yet Another Sort Benchmarking Question
by salva (Canon) on May 09, 2005 at 09:30 UTC
    both
    @foo = sort { $a cmp $b } @bar; @foo = sort { $a <=> $b } @bar;
    are optimized by the perl compiler and instead of the perl comparison block an equivalent C function is used.