Re: Hash sorting
by tilly (Archbishop) on May 12, 2003 at 19:01 UTC
|
# Pull out a list of records
my @data;
foreach my $section (keys %hash) {
    foreach my $item (keys %{$hash{$section}}) {
        push @data, [$hash{$section}{$item}, $section, $item];
    }
}
# Sort it: count descending, then section and item ascending
@data = sort {
    $b->[0] <=> $a->[0]
        or $a->[1] cmp $b->[1]
        or $a->[2] cmp $b->[2]
} @data;
# Print it.
print "Count Section Item\n";
foreach my $record (@data) {
    printf("%5d %7s %7s\n", @$record);
}
|
|
my @data;
foreach my $section (keys %hash) {
    foreach my $item (keys %{$hash{$section}}) {
        push @data, printf "%5d %7s %7s\n", $hash{$section}{$item}, $section, $item;
    }
}
print sort @data;
jdporter The 6th Rule of Perl Club is -- There is no Rule #6.
|
|
Make the printf a sprintf and it works.
But the point of my post was to demonstrate how building up the array of structures always lets you find your way through the logic. That is why I avoided tricks, nested maps, and the like.
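For reference, the sprintf version of the snippet above might look like the sketch below. The contents of %hash are made-up sample data, not the original poster's. Note that a plain string sort puts the smallest count first, because the right-justified counts are compared as text.

```perl
use strict;
use warnings;

# Hypothetical sample data; the real %hash comes from the original question.
my %hash = (
    fruit => { apple => 12, pear   => 3 },
    veg   => { leek  => 7,  carrot => 12 },
);

my @data;
foreach my $section (keys %hash) {
    foreach my $item (keys %{ $hash{$section} }) {
        # sprintf returns the formatted line; printf would print it
        # immediately and push only its return value.
        push @data, sprintf "%5d %7s %7s\n",
            $hash{$section}{$item}, $section, $item;
    }
}

# Plain string sort: ascending, and only meaningful while every
# count fits in the 5-column field.
my @sorted = sort @data;
print @sorted;
```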
|
|
I often use this technique in small data-munging scripts. It is a fast variant of the Guttman Rosler Transform, and shares these limitations with GRT:
- You must know the maximum size of each data element. For example, if the count is more than 5 digits, then the keys will not align, and the sort will be wrong.
- You cannot use it to mix ascending and descending sorts. For example, tilly sorted the count descending, but section and item ascending. GRT can't do that.
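The first limitation can be seen in a small sketch. The counts below are invented for illustration. The point is only that a six-digit count overflows the %5d field, so the formatted strings no longer compare in numeric order.

```perl
use strict;
use warnings;

# Invented counts; 100000 needs six columns and overflows "%5d".
my @counts = (99999, 100000, 42);

# Fixed-width keys, compared as plain strings.
my @keys      = map { sprintf "%5d", $_ } @counts;
my @by_string = sort @keys;

# Reference result: a genuine numeric sort.
my @by_number = sort { $a <=> $b } @counts;

print "string sort:  @by_string\n";
print "numeric sort: @by_number\n";
```

In the string sort, 100000 lands before 99999. Widening the field to fit the largest expected count (%6d here) restores the order, at the price of having to know that maximum in advance.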
|
|
|
|
I wrote a very similar solution, except for the iterations in the @data-building block.
while ( my ($section,$v1) = each %$hash ) {
    while ( my ($item,$count) = each %$v1 ) {
        push @data, [$count, $section, $item];
    }
}
This is not a knee-jerk premature optimization; it is how I think of the loop. I believe that, as an idiom, while/each should be favored over foreach/keys when the key and value are both needed but the order of access does not matter.
And I am evangelizing. :)
|
|
And I believe that whether or not it should be favoured is a matter of who you are dealing with.
I believe that while/each should not be used unless you understand the subtlety of context coercion that keeps you from exiting the loop early (quick, why when you needed just the section should you *not* just grab $section in scalar context?), understand that you have only one iterator per hash (what bug can that lead to?), and are aware of which manipulations you cannot perform on the hash while you are iterating over it.
If you don't understand that clearly, or you do not wish to make sure that whoever works with the code understands this highly Perl-specific knowledge clearly, then it is much, much better to just use the foreach/keys method of iteration. (Honestly I have been avoiding having to explain certain aspects of context-coercion by careful selection of idioms, and I needed to double-check that your code as presented was always going to do the right thing...)
Given that, I could work in a shop which used either idiom and be happy. But if you have people who use Perl only sometimes, and do lots with other languages, then I would suggest sticking with the foreach/keys method since there are fewer Perl-specific things that they have to remember to avoid getting burned in confusing ways.
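The single-iterator point can be illustrated with a short sketch. The hash below is hypothetical. What it shows is that leaving a while/each loop early leaves the hash's iterator part-way through, so the next each loop over the same hash silently skips entries unless the iterator is reset first, for instance by calling keys.

```perl
use strict;
use warnings;

# Hypothetical data, just to have three entries to iterate over.
my %count = (apple => 3, pear => 6, leek => 9);

# Leave the loop after the first pair; the iterator stays where it stopped.
while (my ($item, $n) = each %count) {
    last;
}

# This loop resumes from the stalled iterator and sees only the rest.
my $seen_after_early_exit = 0;
while (my ($item, $n) = each %count) {
    $seen_after_early_exit++;
}

# Stall the iterator again, then reset it explicitly.
while (my ($item, $n) = each %count) {
    last;
}
keys %count;    # keys in void context resets the iterator

my $seen_after_reset = 0;
while (my ($item, $n) = each %count) {
    $seen_after_reset++;
}

print "after early exit: $seen_after_early_exit, after reset: $seen_after_reset\n";
```

A foreach over keys builds its list up front, so it is not affected by a stalled iterator in this way.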
|
|
I agree entirely. I've been baffled by how many people either never think of each or even say they actively avoid it. Especially when you're doing several lines of work with each pair, I find the each form moderately to significantly less noisy.
Although I'd've named $v1 something like $itemcount. :)
Makeshifts last the longest.
|
|
Perhaps we should code up a generic solution.
(UNTESTED)
sub tree_paths
{
    my $tree = shift; # assumes hashref
    map
    {
        my $k = $_;
        my $v = $tree->{$k};
        ref $v
            ? map( [ $k, @$_ ], tree_paths( $v ) )
            : [ $k, $v ]
    }
    keys %$tree
}
# now use it with the OP's $hash hashref
print "Count Section Item\n";
for (
    sort {
        $b->[2] <=> $a->[2] # count
            or
        $a->[0] cmp $b->[0] # section
            or
        $a->[1] cmp $b->[1] # item
    } tree_paths($hash) )
{
    my( $section, $item, $count ) = @$_;
    printf "%5d %7s %7s\n", $count, $section, $item;
}
Or perhaps we'd like records with named members:
(UNTESTED)
sub tree_tuples
{
    my( $tree, @names ) = @_;
    my $name = shift @names;
    map
    {
        my $k = $_;
        my $v = $tree->{$k};
        ref $v
            ? map( +{ $name => $k, %$_ }, tree_tuples( $v, @names ) )
            : { $name => $k, $names[0] => $v }
    }
    keys %$tree
}
# now use it with the OP's $hash hashref
print "Count Section Item\n";
for (
    sort {
        $b->{'count'} <=> $a->{'count'}
            or
        $a->{'section'} cmp $b->{'section'}
            or
        $a->{'item'} cmp $b->{'item'}
    } tree_tuples( $hash, qw( section item count ) ) )
{
    printf "%5d %7s %7s\n", @{$_}{qw( count section item )};
}
jdporter The 6th Rule of Perl Club is -- There is no Rule #6.
Re: Hash sorting
by broquaint (Abbot) on May 12, 2003 at 19:01 UTC
|
my $hash = {
    foo => { one   => [ 1 .. 3 ], },
    bar => { two   => [ 1 .. 6 ], },
    baz => { three => [ 1 .. 9 ], },
};
my @sorted;
foreach my $s (keys %$hash) {
    push @sorted, map [ scalar @{$hash->{$s}{$_}}, $s, $_ ],
                      keys %{ $hash->{$s} };
}
print join(" ", @$_), $/
    for sort { $a->[0] <=> $b->[0] } @sorted;
__output__
3 foo one
6 bar two
9 baz three
That seems to do the trick. See perldsc for more info.
HTH
broquaint
Re: Hash sorting
by suaveant (Parson) on May 12, 2003 at 21:57 UTC
|
Hehe... I don't know how good an idea it is, but this seems to work and requires no secondary data structures, as you wanted.
my $hash = {
    foo => { one => 3 },
    bar => { two => 6 },
    baz => { three => 9 },
};
print join($/, sort { $b <=> $a }
           map { my $section = $_;
                 map { sprintf("%5d %10s %s", $hash->{$section}{$_}, $section, $_)
                     } keys %{$hash->{$section}}
               } keys %$hash )
    ,$/;
- Ant
- Some of my best work - (1 2 3)
Re: Hash sorting
by Util (Priest) on May 13, 2003 at 05:43 UTC
|
print "@$_\n" foreach sort {
    $a->[0] <=> $b->[0] or
    $a->[1] cmp $b->[1] or
    $a->[2] cmp $b->[2]
} map {
    my ($s,$v)=@$_;
    map {[$v->{$_}, $s, $_]} keys %$v;
} map {
    [$_,$hash->{$_}]
} keys %$hash;

print "@$_\n" foreach sort {
    $a->[0] <=> $b->[0] or
    $a->[1] cmp $b->[1] or
    $a->[2] cmp $b->[2]
} map {
    my $s = $_;
    map { [ $hash->{$s}{$_}, $s, $_] } keys %{$hash->{$s}}
} keys %$hash;
I would definitely use a secondary data structure like @data. As a wise man once said, "Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live."
Re: Hash sorting
by artist (Parson) on May 12, 2003 at 19:28 UTC
|
Thanks for all the answers. I was hoping I could do it without requiring another data structure like @data.
artist