I had a need to sort an array of hashrefs by two keys. I had been using a Schwartzian Transform for this, but lo and behold, I actually was slowing things down. The
Sort::Key package by
salva is 4-5 times faster than the built-in sort.
And the API for sorting is much clearer than what I have to type for the Perl sort or my ST sort.
RESULTS
s/iter merlyn plain salva
merlyn 1.42 -- -28% -83%
plain 1.03 38% -- -77%
salva 0.241 490% 327% --
Ye 0lde Source Code
#!/usr/bin/perl
=head1 DESCRIPTION
The purpose of this program is to compare sorting speed on a dataset.
The dataset is an arrayref of hashrefs. Each hashref contains 3 fields
+:
text - a text field
m - a numeric field
n - another numeric field
The data is to be sorted by m and n, similar to the SQL query:
SELECT * FROM dataset ORDER BY m ASC, n ASC
Note: I now realize that the Schwartzian Transform is not being used i
+n
a good scenario. If it were expensive to re-compute m and/or n for eac
+h
sort comparison, then the Transform would do good. But as it is, I'm
trying to substitute hashref access to m and n with arrayref access, a
+nd
that is not saving any time.
=cut
use strict;
use warnings;
use Benchmark qw(:all) ;
use Storable;
use Acme::Wabby;
use Sort::Key::Multi qw(nnkeysort);
our @out;
our $set_size = 100_000 ;
my @data;
my $data = build_data();
cmpthese( 10 => {
'merlyn' => sub { merlyn($data) },
'salva' => sub { salva($data) },
'plain' => sub { plain($data) },
}
);
sub plain {
my $data = shift;
@out =
sort { $a->{m} <=> $b->{m} || $a->{n} <=> $b->{n} }
@$data ;
return @out;
}
sub merlyn {
my $data = shift;
@out =
map { $->[0] }
sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] }
map { [$_, $_->{m}, $_->{n}] } @$data ;
return @out;
}
sub salva {
my $data = shift;
@out = nnkeysort { $_->{m}, $_->{n} } @$data ;
}
sub build_data {
my @data;
my $text = <<EOF;
From fairest creatures we desire increase,
That thereby beauty's rose might never die,
But as the riper should by time decease,
His tender heir might bear his memory:
But thou contracted to thine own bright eyes,
Feed'st thy light's flame with self-substantial fuel,
Making a famine where abundance lies,
Thy self thy foe, to thy sweet self too cruel:
Thou that art now the world's fresh ornament,
And only herald to the gaudy spring,
Within thine own bud buriest thy content,
And tender churl mak'st waste in thine garding:
Pity the world, or else this glutton be,
To eat the world's due, by the grave and thee.
EOF
my $wabby = Acme::Wabby->new;
$wabby->add($text);
for my $i (1 .. $set_size) {
my $m = rand($i);
my $n = rand($m);
push @data , {
text => $wabby->spew,
m => $m,
n => $n
}
;
}
\@data;
}
I have beheld the tarball of 22.1 on ftp.gnu.org with my own eyes.
How can you say that there is no God in the Church of Emacs? -- David Kastrup
|
[tag://sort,etl,data]
|