Re: A heap of medians: efficiency vs. speed

Another recent topic here led me to this article: www.sysarch.com/Perl/sort_paper.html. It shows that of two seemingly identical sorts:

@fast = sort @unsorted;
@slow = sort { $a <=> $b } @unsorted;
[download]

The first one is faster because it runs entirely in C in the Perl core, while the second is slower because the sortsub executes in Perl code.

I was sorting tens of thousands of files by creation time in reverse chronological order, and was delighted to find the first one below faster than the second:

@fast = reverse sort @unsorted;
@slow = sort { $b <=> $a } @unsorted;
[download]

kyle, I wonder how your median_sort might benchmark if it used my @s = sort @_;?

Comment on Re: A heap of medians: efficiency vs. speed Select or Download Code

Replies are listed 'Best First'.
Re^2: A heap of medians: efficiency vs. speed by kyle (Abbot) on Jan 13, 2009 at 19:29 UTC
I'll start by pointing out that the two sorts you show are not identical. `use List::Util qw( shuffle ); my @n = shuffle 5 .. 15; print join( q{, }, sort @n ), "\n"; print join( q{, }, sort { $a <=> $b } @n ), "\n"; __END__ 10, 11, 12, 13, 14, 15, 5, 6, 7, 8, 9 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15` [download] That's because sort, by default, uses "`cmp`" rather than "`<=>`" for comparisons. That's sort of beside your point, however. A sort with the comparison unspecified will be faster than one that's explicit, and there's a convenient special case for "`reverse sort`". That's all good. It's sort of beside the point of my meditation, however. The point here was to compare efficiency and speed. With the tied scalars, I was able to show that the heap method was more efficient—it did fewer comparisons. In spite of that, it was still slower because sort does its comparisons so much faster due to being optimized C. In any case, thanks for your comment.	[reply] [d/l] [select]

Replies are listed 'Best First'.

Re^2: A heap of medians: efficiency vs. speed
by kyle (Abbot) on Jan 13, 2009 at 19:29 UTC

I'll start by pointing out that the two sorts you show are not identical.

use List::Util qw( shuffle );

my @n = shuffle 5 .. 15;
print join( q{, }, sort @n ), "\n";
print join( q{, }, sort { $a <=> $b } @n ), "\n";
__END__
10, 11, 12, 13, 14, 15, 5, 6, 7, 8, 9
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
[download]

That's because sort, by default, uses "cmp" rather than "<=>" for comparisons. That's sort of beside your point, however. A sort with the comparison unspecified will be faster than one that's explicit, and there's a convenient special case for "reverse sort". That's all good.

It's sort of beside the point of my meditation, however. The point here was to compare efficiency and speed. With the tied scalars, I was able to show that the heap method was more efficient—it did fewer comparisons. In spite of that, it was still slower because sort does its comparisons so much faster due to being optimized C.

In any case, thanks for your comment.

[reply]
[d/l]
[select]