Re: Common elements of a hash of hashes
by blazar (Canon) on Oct 07, 2005 at 09:31 UTC
|
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
my %hoh = (
elt1 => {
A => "ccc",
B => "ccc",
c => "ccc",
},
elt2 => {
A => "ccc",
C => "ccc",
D => "ccc",
B => "ccc",
},
elt3 => {
A => "ccc",
E => "ccc",
C => "ccc",
},
);
my @keys=keys %hoh;
my %saw;
for (@keys) {
$saw{lc,}++ for keys %{ $hoh{$_} };
}
for (@keys) {
my $h=$hoh{$_};
$saw{lc,}<@keys and delete $h->{$_} for keys %$h;
}
print Dumper \%hoh;
__END__
| [reply] [d/l] |
Re: Common elements of a hash of hashes
by Perl Mouse (Chaplain) on Oct 07, 2005 at 08:56 UTC
|
#!/usr/bin/perl
use strict;
use warnings;
my %HoH = (
elt1 => {
A => "ccc",
B => "ccc",
C => "ccc",
},
elt2 => {
A => "ccc",
C => "ccc",
D => "ccc",
B => "ccc",
},
elt3 => {
A => "ccc",
E => "ccc",
C => "ccc",
},
);
my %count;
my $count = keys %HoH;
while (my ($name, $hash) = each %HoH) {
while (my ($key) = each %$hash) {
$count{$key}++;
}
}
while (my ($name, $hash) = each %HoH) {
while (my ($key) = each %$hash) {
delete $$hash{$key} unless $count{$key} == $count;
}
}
use Data::Dumper;
print Dumper \%HoH;
__END__
$VAR1 = {
'elt2' => {
'A' => 'ccc',
'C' => 'ccc'
},
'elt3' => {
'A' => 'ccc',
'C' => 'ccc'
},
'elt1' => {
'A' => 'ccc',
'C' => 'ccc'
}
};
| [reply] [d/l] |
|
|
use strict;
use warnings;
my %HoH = (
elt1 => {
A => "ccc",
B => "ccc",
C => "ccc",
},
elt2 => {
A => "ccc",
C => "ccc",
D => "ccc",
B => "ccc",
},
elt3 => {
A => "ccc",
E => "ccc",
C => "ccc",
},
);
my %count;
my $count= scalar keys %HoH;
for ( [ grep { $count!=$count{$_} } map { ++$count{$_};$_ } map { keys
+ %{$HoH{$_}} } keys %HoH ] ) {
foreach my $hash (values %HoH) {
delete(@$hash{@$_});
}
}
use Data::Dumper;
print Dumper(\%HoH);
$\=~s;s*.*;q^|D9JYJ^^qq^\//\\\///^;ex;print
| [reply] [d/l] [select] |
|
|
How do you like this version?
Not obscure enough for obfuscated code, and too obscure for normal code. I never liked for (SCALAR) {... $_ ...} as an alternative for {my $var = SCALAR; ... $var ...}.
| [reply] [d/l] [select] |
Re: Common elements of a hash of hashes
by Skeeve (Parson) on Oct 07, 2005 at 10:06 UTC
|
#!/usr/bin/perl
use strict;
use warnings;
my %HoH = (
elt1 => {
A => "ccc",
B => "ccc",
C => "ccc",
},
elt2 => {
A => "ccc",
C => "ccc",
D => "ccc",
B => "ccc",
},
elt3 => {
A => "ccc",
E => "ccc",
C => "ccc",
},
);
my %count;
my $count= scalar keys %HoH;
my $k= [grep {$count==++$count{$_}} map {keys %{$HoH{$_}}} keys %HoH];
foreach my $h (values %HoH) {
%$h= map {$_,$h->{$_}} @$k;
}
use Data::Dumper;
print Dumper(\%HoH);
$\=~s;s*.*;q^|D9JYJ^^qq^\//\\\///^;ex;print
| [reply] [d/l] [select] |
Re: Common elements of a hash of hashes
by Fang (Pilgrim) on Oct 07, 2005 at 10:45 UTC
|
The entries are in, let's see how they fare in the much awaited Benchmark competition. ;-)
Behold the results...
Benchmark: timing 5000 iterations of Blazar, Perl_Mouse, Skeeve1, Skeeve2...
Blazar: 7 wallclock secs ( 3.61 usr + 1.36 sys = 4.97 CPU) @ 1006.04/s (n=5000)
Perl_Mouse: 7 wallclock secs ( 3.54 usr + 1.36 sys = 4.90 CPU) @ 1020.41/s (n=5000)
Skeeve1: 7 wallclock secs ( 3.48 usr + 1.35 sys = 4.83 CPU) @ 1035.20/s (n=5000)
Skeeve2: 8 wallclock secs ( 3.64 usr + 1.35 sys = 4.99 CPU) @ 1002.00/s (n=5000)
Rate Skeeve2 Blazar Perl_Mouse Skeeve1
Skeeve2 1002/s -- -0% -2% -3%
Blazar 1006/s 0% -- -1% -3%
Perl_Mouse 1020/s 2% 1% -- -1%
Skeeve1 1035/s 3% 3% 1% --
Update: modified the code and benchmark results, which were actually meaningless, thanks to ikegami's wise and knowledgeable advice. And as he rightly predicted, the difference between each method really isn't that huge.
| [reply] [d/l] |
|
|
Your results are garbage.
1) The %HoH you pass as an argument is a global from some package (main?), not the my %HoH which is not visible to that code when its compiled by Benchmark. Instead of
Perl_Mouse => 'perl_mouse(\%HoH)',
use
Perl_Mouse => sub { perl_mouse(\%HoH) },
to avoid this problem.
2) The first call of the first function will remove the uncommon elements. For every subsequent call to that function and to the other functions, %HoH will only contains keys common to all hashes. You need to deeply copy %HoH before passing it to the functions.
Overall, I don't expect a major difference in speed between the different versions. Perl Mouse's will probably be the slowest (since it does all its looping in Perl, where as the others do it in C), but 1) it uses minimal memory, and more importantly, 2) it's the easiest to read. Using foreach values instead of while each will probably speed it up at the expense of memory.
| [reply] [d/l] [select] |
|
|
The first call of the first function will remove the uncommon elements. For every subsequent call to that function and to the other functions, %HoH will only contains keys common to all hashes. You need to deeply copy %HoH before passing it to the functions.
Alright, I felt this would be an issue, but thought I had resolved it by passing a reference to each sub and then dereferencing it. Unfortunately, I missed the fact that the values of the hash were references, which would modify the original values in place... I plead guilty.
To deeply copy the structure, I've just read about the dclone function from Storable.pm, I believe this is what's needed.
I'll rewrite the small script in hope it can be more meaningful and useful than what I previously posted.
| [reply] |
|
|
Update: modified the code and benchmark results, which were actually meaningless, thanks to ikegami's wise and knowledgeable advice. And as he rightly predicted, the difference between each method really isn't that huge.
Also note that my solution does something slightly different than the others. In fact I tried to adhere strictly to the specifications as of the wanted input and output examples given in the first post. Indeed you'll notice that the 'c' key appears both lowercase and uppercase: this seems probable to be just a typo, which is why others plainly ignored it; but since you could deal with it just fine with virtually no more effort, I just did it -- for instructive reasons.
| [reply] [d/l] |
|
|
There is more than one way to do it...
Thanks very much for your help (and also for the benchmarking). I did not think that my question would create such a competition ;)
I will need some time to digest some piece of code before understand it, especially the solutions provided by skeeve.
| [reply] |
Re: Common elements of a hash of hashes
by Skeeve (Parson) on Oct 07, 2005 at 08:56 UTC
|
| [reply] [d/l] |
Re: Common elements of a hash of hashes
by Anonymous Monk on Oct 07, 2005 at 16:52 UTC
|
Sorry but some syntaxic piece of code are really obscur. i am not as comfortable as you with perl. do you mind giving me a short explanation on the elements hereunder?
sub perl_mouse {
my %HoH = %{+shift}; # I know @_ to get the parameter but not this
+ one
}
$saw{lc,} # what is lc,?
grep { $count!=$count{$_} } map { ++$count{$_};$_ } map { keys %{$HoH{
+$_}} } keys %HoH
# the result of this line is stored in which variable?
thanks
| [reply] [d/l] |
|
|
my %HoH = %{+shift};
Did I miss something? I don't see the word "shift" in the entire thread. Anyway, that takes the first parameter, dereferences it as a hashref and assigns the resulting hash to %HoH. The plus sign is needed for preventing Perl to take $shift as a reference and instead use the shift function. shift without arguments operates on @_ (UPDATE: or @ARGV depending on the scope).
lc is "lowercase". I'm not sure about the comma.
grep { $count!=$count{$_} } map { ++$count{$_};$_ } map { keys %{$HoH{$_}} } keys %HoH
the result of this line is stored in which variable?
That returns a list but, since it's surrounded with [ and ], the for loop gets a reference to that list. It's not stored anywhere. It's visible inside the loop as $_.
I see it the other way around, that is, s/obscure/brilliant/.
| [reply] |
|
|
lc is "lowercase". I'm not sure about the comma.
It's there because of this:
$ cat foo.pl
my %a;
$_='fOo';
$a{lc}++;
$a{lc()}++;
$a{+lc}++;
$a{lc,}++;
$ perl -MO=Deparse foo.pl
my %a;
$_ = 'fOo';
++$a{'lc'};
++$a{lc $_};
++$a{lc $_};
++$a{lc $_};
foo.pl syntax OK
So it's there for disambiguating too, just as +shift. I chose the latter form because in this context (involving the ++ operator) it seemed to me to be the least obtrusive and visually distracting.
| [reply] [d/l] [select] |
|
|
Sorry but some syntaxic piece of code are really obscur. i am not as comfortable as you with perl. do you mind giving me a short explanation on the elements hereunder?
sub perl_mouse {
my %HoH = %{+shift}; # I know @_ to get the parameter but not this
+ one
}
$saw{lc,} # what is lc,?
As a general rule if you see a bareword like those, chances are that they may be built in functions. So, always as a general rule, you'll find that perl ('s excellent documentation) may already give you an answer; just check e.g. (in this case):
But since there's a subtle issue involved here, please read also the following replies, including mine to Hue-Bond's.
| [reply] [d/l] |
|
|
Thanks for your accurate explanation. I knew the functions shift and lc. In fact, the tricky points were the comma and the plus.
I have the habit to read the documentation but I don't see very often this kind of syntax in it. to imagine how works the array reference of a grep followed by 2 map in a loop is not easy for everybody I guess.
but thank to the monks, everybody can learn.
I see it the other way around, that is, s/obscure/brilliant/.
I would say s/obscure/brilliant_but_obuscure/ ;
| [reply] |