Hash sorting

artist has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Hash sorting by tilly (Archbishop) on May 12, 2003 at 19:01 UTC
Untested bad code.

`# Pull out a list of records
my @data;
foreach my $section (keys %hash) {
  foreach my $item (keys %{$hash{$section}}) {
    push @data, [$hash{$section}{$item}, $section, $item];
  }
}

# sort it
@data = sort {
  $b->[0] <=> $a->[0] or
  $a->[1] cmp $b->[1] or
  $a->[2] cmp $b->[2]
} @data;

# Print it.
print "Count Section Item\n";
foreach my $record (@data) {
  printf("%5d %7s %7s\n", @$record);
}`
Re: Re: Hash sorting by jdporter (Paladin) on May 12, 2003 at 19:17 UTC
Very good. But how about

`my @data;
foreach my $section (keys %hash) {
  foreach my $item (keys %{$hash{$section}}) {
    push @data, printf "%5d %7s %7s\n",
      $hash{$section}{$item}, $section, $item;
  }
}
print sort @data;`

jdporter
The 6th Rule of Perl Club is -- There is no Rule #6.
Re: Re: Re: Hash sorting by tilly (Archbishop) on May 12, 2003 at 19:56 UTC
Make the printf a sprintf and it works. But the point of my post was to demonstrate how building up the array of structures allows you to always find your way through the logic. Which is why I didn't use any tricks to find my way through the logic, didn't nest maps, etc.
Re: Re: Re: Hash sorting by Util (Priest) on May 13, 2003 at 05:09 UTC
I often use this technique in small data-munging scripts. It is a fast variant of the Guttman Rosler Transform, and shares these limitations with GRT:

- You must know the maximum size of each data element. For example, if the count is more than 5 digits, then the keys will not align, and the sort will be wrong.
- You cannot use it to mix ascending and descending sorts. For example, tilly sorted the count descending, but section and item ascending. GRT can't do that.
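To make the first limitation concrete, here is a minimal sketch of the packed-string sort with one count that overflows its field. The sample data and the helpers `packed_sort` and `leading_counts` are made up for illustration and are not from the thread; the sort is a plain ascending string sort.

```perl
use strict;
use warnings;

# Hypothetical sample data: one count (100000) needs six digits.
my %hash = (
    fruit => { apple => 20000, pear => 100000 },
    veg   => { leek  => 7 },
);

# Build one formatted string per record, then sort the strings as text.
# $width is the field width used for the count column.
sub packed_sort {
    my ($width) = @_;
    my @lines;
    foreach my $section (keys %hash) {
        foreach my $item (keys %{ $hash{$section} }) {
            push @lines, sprintf "%${width}d %7s %7s",
                $hash{$section}{$item}, $section, $item;
        }
    }
    my @sorted = sort @lines;
    return @sorted;
}

# Pull the count back out of each formatted line.
sub leading_counts { return map { (split ' ', $_)[0] } @_ }

my @narrow = leading_counts( packed_sort(5) );   # field too narrow for 100000
my @wide   = leading_counts( packed_sort(6) );   # field fits every count

print "width 5: @narrow\n";   # width 5: 7 100000 20000
print "width 6: @wide\n";     # width 6: 7 20000 100000
```

With a width of 5 the six-digit count lands between 7 and 20000, because the strings are compared character by character and the overflowing number is no longer right-aligned with the others. Widening the field to 6 restores numeric order.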
Re^4: Hash sorting (GRT) by tye (Sage) on May 13, 2003 at 15:09 UTC
Re: Re: Hash sorting by Util (Priest) on May 13, 2003 at 05:09 UTC
I wrote a very similar solution, except for the iterations in the `@data`-building block.

`while ( my ($section,$v1) = each %$hash ) {
  while ( my ($item,$count) = each %$v1 ) {
    push @data, [$count, $section, $item];
  }
}`

This is not a knee-jerk premature optimization; it is how I think of the loop. I believe that, as an idiom, `while/each` should be favored over `foreach/keys` when the key and value are both needed but the order of access does not matter. And I am evangelizing. :)
Re: Re: Re: Hash sorting by tilly (Archbishop) on May 13, 2003 at 15:38 UTC
And I believe that whether or not it should be favoured is a matter of who you are dealing with.

I believe that while/each should not be used unless you understand the subtlety of context coercion that keeps you from exiting the loop early (quick, why when you needed just the section should you not just grab $section in scalar context?), understand that you only have one iterator for the hash (what bug can that lead to?), and are aware of what manipulations you cannot do to the hash while you are iterating over it. If you don't understand that clearly, or you do not wish to make sure that whoever works with the code understands this highly Perl-specific knowledge clearly, then it is much, much better to just use the foreach/keys method of iteration.

(Honestly I have been avoiding having to explain certain aspects of context-coercion by careful selection of idioms, and I needed to double-check that your code as presented was always going to do the right thing...)

Given that, I could work in a shop which used either idiom and be happy. But if you have people who use Perl only sometimes, and do lots with other languages, then I would suggest sticking with the foreach/keys method since there are fewer Perl-specific things that they have to remember to avoid getting burned in confusing ways.
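One instance of the single-iterator problem can be sketched as follows. This is an illustration with made-up data, not code from the thread: it leaves a `while/each` loop early, then shows that a later `each` loop over the same hash picks up where the first one stopped unless the iterator is reset by calling `keys`.

```perl
use strict;
use warnings;

my %count = ( one => 1, two => 2, three => 3 );   # hypothetical data

# Leave the loop early: the hash's single iterator stays where it stopped.
while ( my ($k, $v) = each %count ) {
    last;    # bail out after the first pair
}

# A later each loop resumes from that position and misses the first pair.
my $seen_without_reset = 0;
while ( my ($k, $v) = each %count ) {
    $seen_without_reset++;
}

# That loop ran to exhaustion, so the iterator is back at the start.
# Repeat the early exit, but reset the iterator afterwards.
while ( my ($k, $v) = each %count ) {
    last;
}
keys %count;    # calling keys resets the hash's iterator

my $seen_after_reset = 0;
while ( my ($k, $v) = each %count ) {
    $seen_after_reset++;
}

print "without reset: $seen_without_reset of 3 pairs\n";   # 2 of 3
print "after reset:   $seen_after_reset of 3 pairs\n";     # 3 of 3
```

A `foreach/keys` loop does not have this problem, since `keys` builds the whole list up front and resets the iterator as a side effect.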
Re^3: Hash sorting by Aristotle (Chancellor) on May 13, 2003 at 14:06 UTC
I agree entirely. I've been baffled by how many people either never think of `each` or even say they actively avoid it. Especially when you're doing several lines of work with each pair, I find the `each` form moderately to significantly less noisy. Although I'd've named `$v1` something like `$itemcount`. `:)`

Makeshifts last the longest.
Re: Re: Hash sorting by jdporter (Paladin) on May 13, 2003 at 14:19 UTC
Perhaps we should code up a generic solution. (UNTESTED)

`sub tree_paths {
  my $tree = shift; # assumes hashref
  map {
    my $k = $_;
    my $v = $tree->{$k};
    ref $v
      ? map( [ $k, @$_ ], tree_paths( $v ) )
      : [ $k, $v ]
  } keys %$tree
}

# now use it with the OP's $hash hashref
print "Count Section Item\n";
for (
  sort {
    $b->[2] <=> $a->[2]    # count
    or $a->[0] cmp $b->[0] # section
    or $a->[1] cmp $b->[1] # item
  } tree_paths($hash)
) {
  my( $section, $item, $count ) = @$_;
  printf "%5d %7s %7s\n", $count, $section, $item;
}`

Or perhaps we'd like records with named members: (UNTESTED)

`sub tree_tuples {
  my( $tree, @names ) = @_;
  my $name = shift @names;
  map {
    my $k = $_;
    my $v = $tree->{$k};
    ref $v
      ? map( +{ $name => $k, %$_ }, tree_tuples( $v, @names ) )
      : { $name => $k, $names[0] => $v }
  } keys %$tree
}

# now use it with the OP's $hash hashref
print "Count Section Item\n";
for (
  sort {
    $b->{'count'} <=> $a->{'count'}
    or $a->{'section'} cmp $b->{'section'}
    or $a->{'item'} cmp $b->{'item'}
  } tree_tuples( $hash, qw( section item count ) )
) {
  printf "%5d %7s %7s\n", @{$_}{qw( count section item )};
}`

jdporter
The 6th Rule of Perl Club is -- There is no Rule #6.
Re: Hash sorting by broquaint (Abbot) on May 12, 2003 at 19:01 UTC
`my $hash = {
  foo => { one   => [ 1 .. 3 ], },
  bar => { two   => [ 1 .. 6 ], },
  baz => { three => [ 1 .. 9 ], },
};

my @sorted;
foreach my $s (keys %$hash) {
  push @sorted, map [ scalar @{$hash->{$s}{$_}}, $s, $_ ],
                    keys %{ $hash->{$s} };
}

print join(" ", @$_), $/
  for sort { $a->[0] <=> $b->[0] } @sorted;

__output__

3 foo one
6 bar two
9 baz three`

That seems to do the trick. See `perldsc` for more info.

HTH

`_________ broquaint`
Re: Hash sorting by suaveant (Parson) on May 12, 2003 at 21:57 UTC
Hehe... don't know how good an idea it is... but this seems to work and requires no secondary data structures, as you wanted.

`my $hash = {
  foo => { one   => 3 },
  bar => { two   => 6 },
  baz => { three => 9 },
};

print join($/,
  sort { $b <=> $a }
  map {
    my $section = $_;
    map {
      sprintf("%5d %10s %s", $hash->{$section}{$_}, $section, $_)
    } keys %{$hash->{$section}}
  } keys %$hash
), $/;`

- Ant
- Some of my best work - (1 2 3)
Re: Hash sorting by Util (Priest) on May 13, 2003 at 05:43 UTC
Tested variants that don't need `@data`:

`print "@$_\n" foreach
  sort {
    $a->[0] <=> $b->[0] or
    $a->[1] cmp $b->[1] or
    $a->[2] cmp $b->[2]
  }
  map {
    my ($s,$v)=@$_;
    map {[$v->{$_}, $s, $_]} keys %$v;
  }
  map { [$_,$hash->{$_}] }
  keys %$hash;`

`print "@$_\n" foreach
  sort {
    $a->[0] <=> $b->[0] or
    $a->[1] cmp $b->[1] or
    $a->[2] cmp $b->[2]
  }
  map {
    my $s = $_;
    map { [ $hash->{$s}{$_}, $s, $_] } keys %{$hash->{$s}}
  }
  keys %$hash;`

I would definitely use a secondary data structure like `@data`. As a wise man once said, "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
Re: Hash sorting by artist (Parson) on May 12, 2003 at 19:28 UTC
Thanks for all the answers. I was hoping I could do it without requiring another data structure like `@data`.

artist
Re: Re: Hash sorting by Aragorn (Curate) on May 12, 2003 at 20:55 UTC
How about something like this:

`my @items = qw(item in array);
my @sections = qw(one two three);
my @count = qw(6 3 4 2 1 5);

sub numerically { $a <=> $b };

foreach my $c (sort numerically @count) {
  foreach my $s (sort numerically @sections) {
    foreach my $i (sort numerically @items) {
      print "$c $s $i\n";
    }
  }
}`

Arjen

Updated: Items and Sections are not numerical.

Updated: Argh! Nevermind. Tiredness and programming don't mix.