Common elements of a hash of hashes

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Common elements of a hash of hashes by blazar (Canon) on Oct 07, 2005 at 09:31 UTC
`#!/usr/bin/perl use strict; use warnings; use Data::Dumper; my %hoh = ( elt1 => { A => "ccc", B => "ccc", c => "ccc", }, elt2 => { A => "ccc", C => "ccc", D => "ccc", B => "ccc", }, elt3 => { A => "ccc", E => "ccc", C => "ccc", }, ); my @keys=keys %hoh; my %saw; for (@keys) { $saw{lc,}++ for keys %{ $hoh{$_} }; } for (@keys) { my $h=$hoh{$_}; $saw{lc,}<@keys and delete $h->{$_} for keys %$h; } print Dumper \%hoh; __END__` [download]	[reply] [d/l]
Re: Common elements of a hash of hashes by Perl Mouse (Chaplain) on Oct 07, 2005 at 08:56 UTC
#!/usr/bin/perl use strict; use warnings; my %HoH = ( elt1 => { A => "ccc", B => "ccc", C => "ccc", }, elt2 => { A => "ccc", C => "ccc", D => "ccc", B => "ccc", }, elt3 => { A => "ccc", E => "ccc", C => "ccc", }, ); my %count; my $count = keys %HoH; while (my ($name, $hash) = each %HoH) { while (my ($key) = each %$hash) { $count{$key}++; } } while (my ($name, $hash) = each %HoH) { while (my ($key) = each %$hash) { delete $$hash{$key} unless $count{$key} == $count; } } use Data::Dumper; print Dumper \%HoH; __END__ $VAR1 = { 'elt2' => { 'A' => 'ccc', 'C' => 'ccc' }, 'elt3' => { 'A' => 'ccc', 'C' => 'ccc' }, 'elt1' => { 'A' => 'ccc', 'C' => 'ccc' } }; [download] `Perl --((8:>*`	[reply] [d/l]
Re^2: Common elements of a hash of hashes by Skeeve (Parson) on Oct 07, 2005 at 09:29 UTC
Almost exactly what I wrote... How do you like this version? ;-) `use strict; use warnings; my %HoH = ( elt1 => { A => "ccc", B => "ccc", C => "ccc", }, elt2 => { A => "ccc", C => "ccc", D => "ccc", B => "ccc", }, elt3 => { A => "ccc", E => "ccc", C => "ccc", }, ); my %count; my $count= scalar keys %HoH; for ( [ grep { $count!=$count{$_} } map { ++$count{$_};$_ } map { keys + %{$HoH{$_}} } keys %HoH ] ) { foreach my $hash (values %HoH) { delete(@$hash{@$_}); } } use Data::Dumper; print Dumper(\%HoH);` [download] `$\=~s;s.;q^\|D9JYJ^^qq^\//\\\///^;ex;print`	[reply] [d/l] [select]
Re^3: Common elements of a hash of hashes by Perl Mouse (Chaplain) on Oct 07, 2005 at 10:20 UTC
How do you like this version? Not obscure enough for obfuscated code, and too obscure for normal code. I never liked `for (SCALAR) {... $_ ...}` as an alternative for `{my $var = SCALAR; ... $var ...}`. `Perl --((8:>*`	[reply] [d/l] [select]
Re: Common elements of a hash of hashes by Skeeve (Parson) on Oct 07, 2005 at 10:06 UTC
Just to get a bit more obscure... `#!/usr/bin/perl use strict; use warnings; my %HoH = ( elt1 => { A => "ccc", B => "ccc", C => "ccc", }, elt2 => { A => "ccc", C => "ccc", D => "ccc", B => "ccc", }, elt3 => { A => "ccc", E => "ccc", C => "ccc", }, ); my %count; my $count= scalar keys %HoH; my $k= [grep {$count==++$count{$_}} map {keys %{$HoH{$_}}} keys %HoH]; foreach my $h (values %HoH) { %$h= map {$_,$h->{$_}} @$k; } use Data::Dumper; print Dumper(\%HoH);` [download] `$\=~s;s.;q^\|D9JYJ^^qq^\//\\\///^;ex;print`	[reply] [d/l] [select]
Re: Common elements of a hash of hashes by Fang (Pilgrim) on Oct 07, 2005 at 10:45 UTC
The entries are in, let's see how they fare in the much awaited Benchmark competition. ;-) Read more... (2 kB) Behold the results... Benchmark: timing 5000 iterations of Blazar, Perl_Mouse, Skeeve1, Skeeve2... Blazar: 7 wallclock secs ( 3.61 usr + 1.36 sys = 4.97 CPU) @ 1006.04/s (n=5000) Perl_Mouse: 7 wallclock secs ( 3.54 usr + 1.36 sys = 4.90 CPU) @ 1020.41/s (n=5000) Skeeve1: 7 wallclock secs ( 3.48 usr + 1.35 sys = 4.83 CPU) @ 1035.20/s (n=5000) Skeeve2: 8 wallclock secs ( 3.64 usr + 1.35 sys = 4.99 CPU) @ 1002.00/s (n=5000) Rate Skeeve2 Blazar Perl_Mouse Skeeve1 Skeeve2 1002/s -- -0% -2% -3% Blazar 1006/s 0% -- -1% -3% Perl_Mouse 1020/s 2% 1% -- -1% Skeeve1 1035/s 3% 3% 1% -- Update: modified the code and benchmark results, which were actually meaningless, thanks to ikegami's wise and knowledgeable advice. And as he rightly predicted, the difference between each method really isn't that huge.	[reply] [d/l]
Re^2: Common elements of a hash of hashes by ikegami (Patriarch) on Oct 07, 2005 at 16:20 UTC
Your results are garbage. 1) The %HoH you pass as an argument is a global from some package (main?), not the `my %HoH` which is not visible to that code when its compiled by Benchmark. Instead of `Perl_Mouse => 'perl_mouse(\%HoH)',` use `Perl_Mouse => sub { perl_mouse(\%HoH) },` to avoid this problem. 2) The first call of the first function will remove the uncommon elements. For every subsequent call to that function and to the other functions, %HoH will only contains keys common to all hashes. You need to deeply copy %HoH before passing it to the functions. Overall, I don't expect a major difference in speed between the different versions. Perl Mouse's will probably be the slowest (since it does all its looping in Perl, where as the others do it in C), but 1) it uses minimal memory, and more importantly, 2) it's the easiest to read. Using `foreach values` instead of `while each` will probably speed it up at the expense of memory.	[reply] [d/l] [select]
Re^3: Common elements of a hash of hashes by Fang (Pilgrim) on Oct 07, 2005 at 20:53 UTC
The first call of the first function will remove the uncommon elements. For every subsequent call to that function and to the other functions, %HoH will only contains keys common to all hashes. You need to deeply copy %HoH before passing it to the functions. Alright, I felt this would be an issue, but thought I had resolved it by passing a reference to each sub and then dereferencing it. Unfortunately, I missed the fact that the values of the hash were references, which would modify the original values in place... I plead guilty. To deeply copy the structure, I've just read about the `dclone` function from `Storable.pm`, I believe this is what's needed. I'll rewrite the small script in hope it can be more meaningful and useful than what I previously posted.	[reply]
Re^2: Common elements of a hash of hashes by blazar (Canon) on Oct 09, 2005 at 09:12 UTC
Update: modified the code and benchmark results, which were actually meaningless, thanks to ikegami's wise and knowledgeable advice. And as he rightly predicted, the difference between each method really isn't that huge. Also note that my solution does something slightly different than the others. In fact I tried to adhere strictly to the specifications as of the wanted input and output examples given in the first post. Indeed you'll notice that the `'c'` key appears both lowercase and uppercase: this seems probable to be just a typo, which is why others plainly ignored it; but since you could deal with it just fine with virtually no more effort, I just did it -- for instructive reasons.	[reply] [d/l]
Re^2: Common elements of a hash of hashes by Anonymous Monk on Oct 07, 2005 at 14:57 UTC
There is more than one way to do it... Thanks very much for your help (and also for the benchmarking). I did not think that my question would create such a competition ;) I will need some time to digest some piece of code before understand it, especially the solutions provided by skeeve.	[reply]
Re: Common elements of a hash of hashes by Skeeve (Parson) on Oct 07, 2005 at 08:56 UTC
Here is the idea for an algorithm for that: count all keys into a new hash Delete all those entries from the new hash where the count is equal to the number of "eltX" elements What remains in the new hash are the keys that are to be delete `$\=~s;s.;q^\|D9JYJ^^qq^\//\\\///^;ex;print`	[reply] [d/l]
Re: Common elements of a hash of hashes by Anonymous Monk on Oct 07, 2005 at 16:52 UTC
Sorry but some syntaxic piece of code are really obscur. i am not as comfortable as you with perl. do you mind giving me a short explanation on the elements hereunder? `sub perl_mouse { my %HoH = %{+shift}; # I know @_ to get the parameter but not this + one } $saw{lc,} # what is lc,? grep { $count!=$count{$_} } map { ++$count{$_};$_ } map { keys %{$HoH{ +$_}} } keys %HoH # the result of this line is stored in which variable?` [download] thanks	[reply] [d/l]
Re^2: Common elements of a hash of hashes by Hue-Bond (Priest) on Oct 07, 2005 at 19:06 UTC
my %HoH = %{+shift}; Did I miss something? I don't see the word "shift" in the entire thread. Anyway, that takes the first parameter, dereferences it as a hashref and assigns the resulting hash to %HoH. The plus sign is needed for preventing Perl to take $shift as a reference and instead use the `shift` function. `shift` without arguments operates on `@_` (UPDATE: or `@ARGV` depending on the scope). `lc` is "lowercase". I'm not sure about the comma. grep { $count!=$count{$_} } map { ++$count{$_};$_ } map { keys %{$HoH{$_}} } keys %HoH the result of this line is stored in which variable? That returns a list but, since it's surrounded with [ and ], the `for` loop gets a reference to that list. It's not stored anywhere. It's visible inside the loop as `$_`. I see it the other way around, that is, `s/obscure/brilliant/`. -- David Serrano	[reply]
Re^3: Common elements of a hash of hashes by blazar (Canon) on Oct 09, 2005 at 09:14 UTC
lc is "lowercase". I'm not sure about the comma. It's there because of this: `$ cat foo.pl my %a; $_='fOo'; $a{lc}++; $a{lc()}++; $a{+lc}++; $a{lc,}++; $ perl -MO=Deparse foo.pl my %a; $_ = 'fOo'; ++$a{'lc'}; ++$a{lc $_}; ++$a{lc $_}; ++$a{lc $_}; foo.pl syntax OK` [download] So it's there for disambiguating too, just as `+shift`. I chose the latter form because in this context (involving the `++` operator) it seemed to me to be the least obtrusive and visually distracting.	[reply] [d/l] [select]
Re^2: Common elements of a hash of hashes by blazar (Canon) on Oct 09, 2005 at 09:15 UTC
Sorry but some syntaxic piece of code are really obscur. i am not as comfortable as you with perl. do you mind giving me a short explanation on the elements hereunder? `sub perl_mouse { my %HoH = %{+shift}; # I know @_ to get the parameter but not this + one } $saw{lc,} # what is lc,?` [download] As a general rule if you see a bareword like those, chances are that they may be built in functions. So, always as a general rule, you'll find that perl ('s excellent documentation) may already give you an answer; just check e.g. (in this case): perldoc -f shift perldoc -f li But since there's a subtle issue involved here, please read also the following replies, including mine to Hue-Bond's.	[reply] [d/l]
Re^3: Common elements of a hash of hashes by Anonymous Monk on Oct 11, 2005 at 22:05 UTC
Thanks for your accurate explanation. I knew the functions shift and lc. In fact, the tricky points were the comma and the plus. I have the habit to read the documentation but I don't see very often this kind of syntax in it. to imagine how works the array reference of a grep followed by 2 map in a loop is not easy for everybody I guess. but thank to the monks, everybody can learn. I see it the other way around, that is, s/obscure/brilliant/. I would say s/obscure/brilliant_but_obuscure/ ;	[reply]