Beefy Boxes and Bandwidth Generously Provided by pair Networks
Just another Perl shrine
 
PerlMonks  

Sort::Key::Multi example

by metaperl (Curate)
on Nov 30, 2007 at 19:16 UTC ( [id://654174]=perlmeditation: print w/replies, xml ) Need Help??

I had a need to sort an array of hashrefs by two keys. I had been using a Schwartzian Transform for this, but lo and behold, I actually was slowing things down. The Sort::Key package by salva is 4-5 times faster than the built-in sort.

And the API for sorting is much clearer than what I have to type for the Perl sort or my ST sort.

RESULTS

s/iter merlyn plain salva merlyn 1.42 -- -28% -83% plain 1.03 38% -- -77% salva 0.241 490% 327% --

Ye 0lde Source Code

#!/usr/bin/perl =head1 DESCRIPTION The purpose of this program is to compare sorting speed on a dataset. The dataset is an arrayref of hashrefs. Each hashref contains 3 fields +: text - a text field m - a numeric field n - another numeric field The data is to be sorted by m and n, similar to the SQL query: SELECT * FROM dataset ORDER BY m ASC, n ASC Note: I now realize that the Schwartzian Transform is not being used i +n a good scenario. If it were expensive to re-compute m and/or n for eac +h sort comparison, then the Transform would do good. But as it is, I'm trying to substitute hashref access to m and n with arrayref access, a +nd that is not saving any time. =cut use strict; use warnings; use Benchmark qw(:all) ; use Storable; use Acme::Wabby; use Sort::Key::Multi qw(nnkeysort); our @out; our $set_size = 100_000 ; my @data; my $data = build_data(); cmpthese( 10 => { 'merlyn' => sub { merlyn($data) }, 'salva' => sub { salva($data) }, 'plain' => sub { plain($data) }, } ); sub plain { my $data = shift; @out = sort { $a->{m} <=> $b->{m} || $a->{n} <=> $b->{n} } @$data ; return @out; } sub merlyn { my $data = shift; @out = map { $->[0] } sort { $a->[1] <=> $b->[1] || $a->[2] <=> $b->[2] } map { [$_, $_->{m}, $_->{n}] } @$data ; return @out; } sub salva { my $data = shift; @out = nnkeysort { $_->{m}, $_->{n} } @$data ; } sub build_data { my @data; my $text = <<EOF; From fairest creatures we desire increase, That thereby beauty's rose might never die, But as the riper should by time decease, His tender heir might bear his memory: But thou contracted to thine own bright eyes, Feed'st thy light's flame with self-substantial fuel, Making a famine where abundance lies, Thy self thy foe, to thy sweet self too cruel: Thou that art now the world's fresh ornament, And only herald to the gaudy spring, Within thine own bud buriest thy content, And tender churl mak'st waste in thine garding: Pity the world, or else this glutton be, To eat the world's due, by the grave and thee. EOF my $wabby = Acme::Wabby->new; $wabby->add($text); for my $i (1 .. $set_size) { my $m = rand($i); my $n = rand($m); push @data , { text => $wabby->spew, m => $m, n => $n } ; } \@data; }

I have beheld the tarball of 22.1 on ftp.gnu.org with my own eyes. How can you say that there is no God in the Church of Emacs? -- David Kastrup
[tag://sort,etl,data]

Replies are listed 'Best First'.
Re: Sort::Key::Multi example
by benizi (Hermit) on Dec 02, 2007 at 21:09 UTC

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlmeditation [id://654174]
Approved by grep
Front-paged by Arunbear
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others rifling through the Monastery: (5)
As of 2024-03-28 23:04 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found