Yet Another Sort Benchmarking Question

tlm has asked for the wisdom of the Perl Monks concerning the following question:

Hola monks!

I wrote the following snippet to compare various basic sorts:

use strict;
use warnings;
use Benchmark 'cmpthese';

srand 0;
my @n = map int(rand(2**7)), 1..1000;
my @a = map chr($_), @n;

cmpthese( -1,
          {
           default_n => sub { ( sort               @n )[ 0 ] },
           cmp_n     => sub { ( sort { $a cmp $b } @n )[ 0 ] },
           '<=>'     => sub { ( sort { $a <=> $b } @n )[ 0 ] },
           default_a => sub { ( sort               @a )[ 0 ] },
           cmp_a     => sub { ( sort { $a cmp $b } @a )[ 0 ] },
          }
        );

__END__
            Rate     cmp_n default_n     cmp_a default_a       <=>
cmp_n      667/s        --       -0%       -3%       -3%      -40%
default_n  667/s        0%        --       -3%       -3%      -40%
cmp_a      691/s        4%        4%        --        0%      -38%
default_a  691/s        4%        4%        0%        --      -38%
<=>       1110/s       66%       66%       61%       61%        --
[download]

I'm having a hard time rationalizing why <=> did so much better than the default alphabetic sort, since it was my understanding that the latter should be better. I think the comparison above is fair; am I wrong?

BTW, I figure that the reason why default_n fares worse than default_a is that the former is comparing strings of length up to 3, while the latter is comparing strings of length 1.

the lowliest monk

Comment on Yet Another Sort Benchmarking Question Download Code

Replies are listed 'Best First'.
Re: Yet Another Sort Benchmarking Question by dave_the_m (Monsignor) on May 08, 2005 at 14:03 UTC
I'm having a hard time rationalizing why <=> did so much better than the default alphabetic sort, since it was my understanding that the latter should be better I don't know what gave you that idea. Its a lot quicker at the hardware level to compare two integers than two strings. As an aside, note that perl specifically optimizes `{$a cmp $b}` and `{$a <=> $b}`, and doesn't actually call the perl code for each pair comparison Dave.	[reply] [d/l] [select]
Re^2: Yet Another Sort Benchmarking Question by tlm (Prior) on May 08, 2005 at 14:28 UTC
I don't know what gave you that idea. What gave me the idea is are statements (probably misconstrued by me) in Guttman and Rosler's paper on sort optimization asserting or implying that Perl's default lexicographic sort is faster than any sort that uses a user-specified comparison sub. As an aside, note that perl specifically optimizes {$a cmp $b} and {$a <=> $b}, and doesn't actually call the perl code for each pair comparison This answers my question. It suggests that ~~some of the claims in the cited paper are no longer valid~~ there are a few special cases for which the assertion in that paper that the default lexicographic sort is faster than any sort using a user-supplied sortsub no longer holds. Thanks. Update: I realized, after reading dave_the_m's and tilly's replies, that, in its original form, the second sentence of my last paragraph may have given the impression that the paper I cited is somehow out-of-date. That was not at all my intent; I've tried to edit the problematic sentence to avoid this false impression. the lowliest monk	[reply]
Re^3: Yet Another Sort Benchmarking Question by dave_the_m (Monsignor) on May 08, 2005 at 15:16 UTC
This answers my question. It suggests that some of the claims in the cited paper are no longer valid The basic message of the paper is still true though. Dave.	[reply]
Re^3: Yet Another Sort Benchmarking Question by tilly (Archbishop) on May 08, 2005 at 19:43 UTC
That claim remains true for any sort operation that has not been special cased inside of perl. The number of exceptions to their claim can currently be counted on one hand.	[reply]
Re: Yet Another Sort Benchmarking Question by salva (Canon) on May 09, 2005 at 09:30 UTC
both `@foo = sort { $a cmp $b } @bar; @foo = sort { $a <=> $b } @bar;` [download] are optimized by the perl compiler and instead of the perl comparison block an equivalent C function is used.	[reply] [d/l]