I had a need to sort an array of hashrefs by two keys. I had been using a Schwartzian Transform for this, but lo and behold, I actually was slowing things down. The Sort::Key package by salva is 4-5 times faster than the built-in sort.

And the API for sorting is much clearer than what I have to type for the Perl sort or my ST sort.

RESULTS

s/iter merlyn plain salva merlyn 1.42 -- -28% -83% plain 1.03 38% -- -77% salva 0.241 490% 327% --

Ye 0lde Source Code

#!/usr/bin/perl =head1 DESCRIPTION The purpose of this program is to compare sorting speed on a dataset. The dataset is an arrayref of hashrefs. Each hashref contains 3 fields +: text - a text field m - a numeric field n - another numeric field The data is to be sorted by m and n, similar to the SQL query: SELECT * FROM dataset ORDER BY m ASC, n ASC Note: I now realize that the Schwartzian Transform is not being used i +n a good scenario. If it were expensive to re-compute m and/or n for eac +h sort comparison, then the Transform would do good. But as it is, I'm trying to substitute hashref access to m and n with arrayref access, a +nd that is not saving any time. =cut use strict; use warnings; use Benchmark qw(:all) ; use Storable; use Acme::Wabby; use Sort::Key::Multi qw(nnkeysort); our @out; our $set_size = 100_000 ; my @data; my $data = build_data(); cmpthese( 10 => { 'merlyn' => sub { merlyn($data) }, 'salva' => sub { salva($data) }, 'plain' => sub { plain($data) }, } ); sub plain { my $data = shift; @out = sort { $a->{m} <=> $b->{m} || $a->{n} <=> $b->{n} } @$data ; return @out; } sub merlyn { my $data = shift; @out = map { $->[0] } sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] } map { [$_, $_->{m}, $_->{n}] } @$data ; return @out; } sub salva { my $data = shift; @out = nnkeysort { $_->{m}, $_->{n} } @$data ; } sub build_data { my @data; my $text = <<EOF; From fairest creatures we desire increase, That thereby beauty's rose might never die, But as the riper should by time decease, His tender heir might bear his memory: But thou contracted to thine own bright eyes, Feed'st thy light's flame with self-substantial fuel, Making a famine where abundance lies, Thy self thy foe, to thy sweet self too cruel: Thou that art now the world's fresh ornament, And only herald to the gaudy spring, Within thine own bud buriest thy content, And tender churl mak'st waste in thine garding: Pity the world, or else this glutton be, To eat the world's due, by the grave and thee. EOF my $wabby = Acme::Wabby->new; $wabby->add($text); for my $i (1 .. $set_size) { my $m = rand($i); my $n = rand($m); push @data , { text => $wabby->spew, m => $m, n => $n } ; } \@data; }

I have beheld the tarball of 22.1 on ftp.gnu.org with my own eyes. How can you say that there is no God in the Church of Emacs? -- David Kastrup
[tag://sort,etl,data]

In reply to Sort::Key::Multi example by metaperl

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.