Sorting a hash of arrays based on a particular array element.

aditya.singh has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Sorting a hash of arrays based on a particular array element. by BrowserUk (Patriarch) on Jul 27, 2005 at 09:27 UTC
Supply the keys of `%hash` to `sort`, then use `$hash{$a}[0]` and `$hash{$b}[0]` to access your sort field:

    %hash = ( 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] );
    # NOTE: ^________ parens, not curlies _________^
    print "$_ => @{ $hash{$_} }\n"
        for sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;

which prints:

    2 => 42 a1
    1 => 51 a2

Realise that you will need to sort the hash each time you want to iterate it in sorted order (or use one of the sorted hash implementations on CPAN).
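If the hash does not change between passes, one way to avoid re-sorting is to keep the sorted key list in an array. The following is only a sketch of that idea, using the sample data from this thread; the array name `@by_first` and the extra entry `3 => [ 7, 'a0' ]` are made up for illustration.

```perl
use strict;
use warnings;

my %hash = ( 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] );

# Sort once and keep the resulting key order in an array.
my @by_first = sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;

# The saved order can be reused for any number of passes.
print "$_ => @{ $hash{$_} }\n" for @by_first;

# A new entry is not in @by_first until the keys are sorted again.
$hash{3} = [ 7, 'a0' ];
@by_first = sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;
print "after insert: @by_first\n";
```

The saved list goes stale as soon as the hash is modified, which is where the sorted hash implementations on CPAN may be the more comfortable choice.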
Re: Sorting a hash of arrays based on a particular array element. by gellyfish (Monsignor) on Jul 27, 2005 at 09:31 UTC
You don't want the curlies when assigning to a hash; with warnings enabled you will get this warning: `Reference found where even-sized list expected` /J\
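To see the difference between the two forms, here is a small sketch. The variable names `%wrong`, `%right` and `$ref` are hypothetical, and the `$SIG{__WARN__}` handler is only there so the warning can be printed alongside the other output.

```perl
use strict;
use warnings;

# Collect warnings instead of letting them go straight to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Curlies build an anonymous hash reference, so %wrong receives one scalar.
my %wrong = { 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] };

# Parens supply the flat key/value list that hash assignment expects.
my %right = ( 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] );

# If a reference is what you want, store it in a scalar and dereference it.
my $ref = { 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] };

print "warning: $_" for @warnings;
print scalar( keys %wrong ), " key(s) in %wrong\n";
print scalar( keys %right ), " key(s) in %right\n";
print "first element of entry 2 via the reference: $ref->{2}[0]\n";
```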
Re: Sorting a hash of arrays based on a particular array element. by Random_Walk (Prior) on Jul 27, 2005 at 09:21 UTC
You need the Schwartzian Transform. If you do a search or Super Search on sorting you will find an abundance of nodes. You could also look up the Orcish Maneuver (a pun on "or-cache"). There is also a great paper on sorting, A Fresh Look at Efficient Perl Sorting.

Update: As merlyn points out below, you do not need the Schwartzian Transform or the Orcish Maneuver, as your sort keys are already there. If you have a large amount of data to sort, you may get some efficiency gain from the "Packed Default" method in the Guttman & Rosler paper. Here is an example of the Packed Default method:

    %hash = ( 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] );
    print "$_ => @{$hash{$_}}\n"
        for map substr($_,2),
            sort map { pack("C2", $hash{$_}[0]) . "$_" } keys %hash;

but for simplicity go for one of the solutions below.

Cheers, R.
Re^2: Sorting a hash of arrays based on a particular array element. by merlyn (Sage) on Jul 27, 2005 at 13:37 UTC
The ST or the Orcish Maneuver helps when the sort keys are expensive to compute. These sort keys are already there! No computation to make. -- Randal L. Schwartz, Perl hacker
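For readers who have not met the two techniques, here is a rough sketch of both on a case where the key does have to be computed. The sub `expensive_key` is a hypothetical stand-in (it just returns the string length and counts its calls), and the word list is invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for an expensive key computation; counts its calls.
my $calls = 0;
sub expensive_key {
    my ($word) = @_;
    $calls++;
    return length $word;
}

my @words = qw(pear fig banana kiwi apple);

# Schwartzian Transform: compute each key once, sort on it, then unwrap.
my @st =
    map  { $_->[1] }
    sort { $a->[0] <=> $b->[0] or $a->[1] cmp $b->[1] }
    map  { [ expensive_key($_), $_ ] } @words;
my $st_calls = $calls;

# Orcish Maneuver: cache each key with ||= inside the comparator.
# Note that ||= recomputes a key whose value is false (0 or '').
$calls = 0;
my %cache;
my @orcish = sort {
    my $ka = $cache{$a} ||= expensive_key($a);
    my $kb = $cache{$b} ||= expensive_key($b);
    $ka <=> $kb or $a cmp $b;
} @words;

print "ST:     @st ($st_calls key computations)\n";
print "Orcish: @orcish ($calls key computations)\n";
```

In the original question the key is simply `$hash{$_}[0]`, so neither wrapper saves any work there.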
Re^2: Sorting a hash of arrays based on a particular array element. by salva (Canon) on Jul 27, 2005 at 13:39 UTC
Not too long ago I released Sort::Key on CPAN. It outperforms the Schwartzian Transform and the methods described in the cited paper in almost every case, and it is easier to use.
Re: Sorting a hash of arrays based on a particular array element. by jbrugger (Parson) on Jul 27, 2005 at 09:25 UTC
You might get more information about it here as well. Update: As BrowserUk said to use parens, not curlies, I think it's a good idea to read about references as well.
Re: Sorting a hash of arrays based on a particular array element. by astroboy (Chaplain) on Jul 27, 2005 at 11:34 UTC
I'm not sure what you want in your result: either the first item in each array reference, or the array references sorted by their first value. I'm going for the latter, but I'm sure that you can modify as required:

    use Sort::Key qw(nkeysort);
    use Data::Dumper;
    my %hash = ( 1 => [ 51, 'a2'], 2 => [42, 'a1'] );
    my @res = nkeysort { $_->[0] } (values %hash);
    print Dumper(\@res);

giving

    $VAR1 = [
              [
                42,
                'a1'
              ],
              [
                51,
                'a2'
              ]
            ];
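If installing Sort::Key is not an option, the same two readings of the question can be sketched with the core `sort` alone. This is an illustration, not a replacement for the module; `@keys` and `@rows` are names chosen for the example.

```perl
use strict;
use warnings;

my %hash = ( 1 => [ 51, 'a2' ], 2 => [ 42, 'a1' ] );

# Reading 1: the hash keys, ordered by the first array element.
my @keys = sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;

# Reading 2: the array references themselves, ordered the same way.
my @rows = sort { $a->[0] <=> $b->[0] } values %hash;

print "keys: @keys\n";
print "row:  @$_\n" for @rows;
```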
Re: Sorting a hash of arrays based on a particular array element. by marto (Cardinal) on Jul 27, 2005 at 09:25 UTC
Hi, A Super Search returns many results for this. Reading How do I post a question effectively? suggests steps to take before asking for help. Martin
Re: Sorting a hash of arrays based on a particular array element. by Random_Walk (Prior) on Jul 27, 2005 at 16:03 UTC
I read the A Fresh Look at Efficient Perl Sorting paper last week and have been itching to play with packed default sorting, so here we go: packed default versus the example code BrowserUk gave above, sorting two million random records 10 times.

    =>cat /tmp/sorttest
    #!/usr/opt/perl5/bin/perl
    use strict;
    use warnings;
    use Benchmark;

    srand(42); # the answer
    my $elements = 2000000;
    my %hash;

    for (1..$elements) {
        my ($key, $one, $two) = rand =~/(\d\.\d{2})(\d{2})(\d{4})/;
        $hash{$key} = [$one, $two];
    }

    sub packed_default {
        my @sorted_keys =
            map substr($_,1),
            sort map {pack('C', $hash{$_}[0])."$_"} keys %hash;
        return @sorted_keys;
    }

    sub BroweserUK {
        my @sorted_keys = sort{ $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;
    }

    timethese(10, {
        'packed_default' => \&packed_default,
        'BroweserUK' => \&BroweserUK,
    });
    __END__

    Benchmark: timing 10 iterations of BroweserUK, packed_default...
    BroweserUK: 0 wallclock secs (0.07 usr + 0.00 sys = 0.07 CPU) @ 142.86/s (n=10)
        (warning: too few iterations for a reliable count)
    packed_default: 0 wallclock secs (0.04 usr + 0.00 sys = 0.04 CPU) @ 250.00/s (n=10)
        (warning: too few iterations for a reliable count)

As they suggest in the paper, coercing perl to use the default sort instead of a custom sub is almost twice as fast.

Cheers, R.
Re^2: Sorting a hash of arrays based on a particular array element. by salva (Canon) on Jul 27, 2005 at 18:54 UTC
Let me extend your benchmark with an entry for Sort::Key:

    ...
    use Sort::Key qw(ikeysort);

    sub SortKey {
        my @sorted_keys = ikeysort { $hash{$_}[0] } keys %hash;
    }

    cmpthese(10, {
        BrowserUK => \&BrowserUK,
        packed_default => \&packed_default,
        SortKey => \&SortKey
    });
    __END__

                     Rate BrowserUK packed_default SortKey
    BrowserUK      50.0/s        --           -30%    -45%
    packed_default 71.4/s       43%             --    -21%
    SortKey        90.9/s       82%            27%      --

BTW, on my computer the difference between the packed and BrowserUK methods is not as big as in your test.
Re^3: Sorting a hash of arrays based on a particular array element. by Random_Walk (Prior) on Jul 28, 2005 at 08:39 UTC
Wow, most impressive. How do you do the sorting under the covers? Can you point me to some online resources? As it is so much faster than perl's sort, will it ever make it into core and replace the current sort, or does it gain speed by having different interfaces for sorting different data types?

Cheers, R.
Re^4: Sorting a hash of arrays based on a particular array element. by salva (Canon) on Jul 28, 2005 at 10:10 UTC
Re^3: Sorting a hash of arrays based on a particular array element. by Random_Walk (Prior) on Jul 28, 2005 at 12:28 UTC
Oh silly, silly me... 2 million values thrown into a hash with only 100 possible keys!

    # for (1..$elements) {
    #     my ($key, $one, $two) = rand =~/(\d\.\d{2})(\d{2})(\d{4})/;
    #     $hash{$key} = [$one, $two];
    # }
    for (1..$elements) {
        my ($one, $two) = rand =~/\d+\.(\d{6})(\d{4})/;
        $hash{$_} = [$one, $two];
    }

    # Also reduced $elements to 200_000, as 2 million
    # exceeded the per-process memory limit.
    # I also use more entropy in the field to sort on,
    # requiring a change of packed_default to ...
    sub packed_default {
        my @sorted_keys =
            map substr($_,4),
            sort map {pack('N', $hash{$_}[0])."$_"} keys %hash;
        return @sorted_keys;
    }

    Benchmark: timing 10 iterations of BroweserUK, packed_default...
    BroweserUK: 188 wallclock secs (175.63 usr + 0.13 sys = 175.76 CPU) @ 0.06/s (n=10)
    packed_default: 72 wallclock secs (64.61 usr + 0.16 sys = 64.77 CPU) @ 0.15/s (n=10)

Sadly I am on a company machine with a restrictive policy about installing modules, so I cannot test Sort::Key here; I will give it a bash at home this evening.

Cheers, R.
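The switch from `'C'` to `'N'` matters because `'C'` packs a single byte (values 0 to 255), while `'N'` packs an unsigned 32-bit integer in big-endian order, so a plain string comparison of the packed prefixes agrees with a numeric comparison of the original values. The sketch below checks that on a tiny, invented data set with unique non-negative sort fields; the keys `a`, `b`, `c` and their values are made up for illustration.

```perl
use strict;
use warnings;

# Illustrative data: sort fields above 255, which a one-byte 'C' prefix cannot hold.
my %hash = ( a => [ 70000, 'x' ], b => [ 300, 'y' ], c => [ 5, 'z' ] );

# Prefix each key with its 4-byte big-endian sort field, sort with the
# default string comparison (no comparator sub), then strip the prefix.
my @packed =
    map { substr $_, 4 }
    sort map { pack( 'N', $hash{$_}[0] ) . $_ } keys %hash;

# The comparator-based sort, for comparison.
my @plain = sort { $hash{$a}[0] <=> $hash{$b}[0] } keys %hash;

print "packed default: @packed\n";
print "numeric sort:   @plain\n";
```

With duplicate sort fields the packed version breaks ties by the key string that follows the prefix, so the two orderings may differ for tied entries.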
Re^4: Sorting a hash of arrays based on a particular array element. by salva (Canon) on Jul 28, 2005 at 13:57 UTC

Update