Faster Common Hash key hunt

JayBonci has asked for the wisdom of the Perl Monks concerning the following question:

Dearest Monks, your assistance is requested. In dealing with several large hashes for a possible upcoming feature for a website, I need to merge a couple of large hashes, and I'd like an efficient way to do it. Basically, I want to take two or more hashes and quickly get back an array of the keys common to all of them. Should be easy, right?

Now, I could do it iteratively, like this:

my @common;
foreach (keys %$foo) {
    push @common, $_ if exists $bar->{$_};
}

Sure, sure, that's great. I can grab each hash and run it through this operation. It can be optimized by putting whichever hash has fewer keys in $foo, but I really feel that I'm not taking advantage of any internal organization that the hashes may have. Is there some lower-level operation that will give me an array (or whatever) of the keys common to two (or more) of them, without destroying any of the hashes?
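The "fewer keys first" optimization mentioned above can be sketched roughly as follows. This is only an illustration under assumptions: the helper name `common_keys_pair` and the sample data are made up, and it handles exactly two hash references.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): keys present in both hashes.
sub common_keys_pair {
    my ($foo, $bar) = @_;

    # keys() in scalar context returns the key count without building a list,
    # so deciding which hash is smaller is cheap.
    ($foo, $bar) = ($bar, $foo)
        if scalar(keys %$bar) < scalar(keys %$foo);

    # Walk the smaller hash and probe the larger one.
    return grep { exists $bar->{$_} } keys %$foo;
}

# Made-up sample data.
my %small = map { $_ => 1 } qw(apple pear);
my %large = map { $_ => 1 } qw(apple banana cherry pear plum);

my @common = common_keys_pair(\%large, \%small);
my @sorted = sort @common;
print "@sorted\n";    # prints "apple pear"
```

The work is proportional to the size of the smaller hash, since each probe of the larger hash is a single lookup. Neither hash is modified.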

I've looked pretty thoroughly through perlfunc and have come up short. Is there a way to speed up this mass comparison? Would destroying the hashes help? Thanks for your time. I hope I'm not missing anything major. Ideally, I'd love:

my @foo = commonkeys($foo, $bar, $splat, $woo, ...);

Anything you guys can think of that isn't in the vein of my current approach? Thanks for your time, gentle monks. Searching for cycles, --jaybonci


Replies are listed 'Best First'.
Re: Faster Common Hash key hunt. by Zaxo (Archbishop) on Jan 24, 2002 at 05:34 UTC
my @common = grep { exists $bar{$_} } keys %foo;

Update: ++jlf for sharp eyes and kind acts. Typo repaired.

Update²: ++maverick for the heads-up and the good solution. Here is another way to do it, as a function that takes a list of hashrefs:

sub commonkeys {
    my %common = %{+shift};    # start with a copy of the first hash
    for my $hr (@_) {
        # drop every key that the next hash lacks
        delete @common{ grep { !exists $hr->{$_} } keys %common };
    }
    [keys %common]
}

This doesn't do much copying. It returns an array ref to the keys common to all.

After Compline, Zaxo
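A possible variation on the same filtering idea, sketched here as an assumption rather than as Zaxo's code: sort the hashrefs by key count, start from the keys of the smallest one, and narrow the list with grep, so no hash is copied. The name `commonkeys_smallest_first` and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical variant: returns an array ref of the keys present in every hash.
sub commonkeys_smallest_first {
    # Order the hashrefs by key count so the candidate list starts small.
    my @hashes = sort { scalar(keys %$a) <=> scalar(keys %$b) } @_;
    return [] unless @hashes;

    my $smallest = shift @hashes;
    my @common   = keys %$smallest;

    for my $hr (@hashes) {
        # Keep only the candidates that the next hash also has.
        @common = grep { exists $hr->{$_} } @common;
        last unless @common;    # nothing left, so stop early
    }
    return \@common;
}

# Made-up sample data.
my %x = map { $_ => 1 } qw(a b c d);
my %y = map { $_ => 1 } qw(b c d e);
my %z = map { $_ => 1 } qw(c d);

my $common = commonkeys_smallest_first(\%x, \%y, \%z);
my @sorted = sort @$common;
print "@sorted\n";    # prints "c d"
```

The candidate list can never be longer than the smallest hash, and the loop exits as soon as it becomes empty. Whether this beats the delete-based version on real data would need measuring.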
Re: Faster Common Hash key hunt. by talexb (Chancellor) on Jan 24, 2002 at 05:57 UTC
Alternatively, there is Hash::Merge, which is designed specifically for merging hashes. --t. alex "Of course, you realize that this means war." -- Bugs Bunny.
Re: Faster Common Hash key hunt. by maverick (Curate) on Jan 24, 2002 at 05:41 UTC
I don't know if this is the fastest way, but I've used this before and it doesn't seem to be too slow. It will use a bit of memory though, as you end up with a hash that holds the union of the keys of all the hashes.

# untested, supports the interface you describe
sub commonkeys {
    my $number_of_hashes = scalar(@_);
    my %union;
    my @result;

    foreach (@_) {
        foreach (keys %{$_}) {
            $union{$_}++;
        }
    }

    while (my ($key, $count) = each %union) {
        if ($count == $number_of_hashes) {
            # a key that has appeared as many times as we have hashes
            # is present in every hash
            push(@result, $key);
        }
    }
    return @result;
}

Odds are if you do a Super Search or look in Q/A, somebody else has had to do this too.

HTH

/\/\averick
perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"
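Since the thread is about speed, one way to compare the counting approach with a grep-based filter is the core Benchmark module. The sketch below is only a starting point: the function names, the hash sizes, and the amount of overlap are all made-up assumptions, and results will vary with real data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Counting approach: tally how many hashes each key appears in.
sub commonkeys_count {
    my $n = @_;    # number of hashes passed in
    my %seen;
    for my $hr (@_) {
        $seen{$_}++ for keys %$hr;
    }
    return grep { $seen{$_} == $n } keys %seen;
}

# Filtering approach: narrow the first hash's keys against the rest.
sub commonkeys_grep {
    my ($first, @rest) = @_;
    return unless $first;
    my @common = keys %$first;
    for my $hr (@rest) {
        @common = grep { exists $hr->{$_} } @common;
    }
    return @common;
}

# Made-up test data: three hashes of 20,000 integer keys with partial overlap.
my %h1 = map { $_ => 1 } 1 .. 20_000;
my %h2 = map { $_ => 1 } 10_001 .. 30_000;
my %h3 = map { $_ => 1 } 15_001 .. 35_000;

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    count => sub { my @c = commonkeys_count(\%h1, \%h2, \%h3) },
    grep  => sub { my @c = commonkeys_grep(\%h1, \%h2, \%h3) },
});
```

The counting version touches every key of every hash and builds the full union, while the filtering version only carries forward the surviving candidates, so the gap should depend on how much the hashes overlap.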
Re: Faster Common Hash key hunt by herveus (Prior) on Jan 24, 2002 at 22:22 UTC
Howdy! Consider also Set::Scalar. If your hashes are Set::Scalar::Valued sets, you can perform all set operations, including intersection. Ordinary sets simply have members that exist or not. Valued sets extend the model so that each member has a scalar value. yours, Michael
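For ordinary (non-valued) sets, an intersection of hash keys with Set::Scalar might look like the sketch below. It assumes Set::Scalar is installed from CPAN (it is not a core module), the sample hashes are made up, and no claim is made that it is faster than a plain grep, since building the set objects has its own cost.

```perl
use strict;
use warnings;
use Set::Scalar;    # from CPAN; not part of core Perl

# Made-up sample data.
my %foo = (apple => 1, pear => 2, plum => 3);
my %bar = (pear  => 9, plum => 8, fig  => 7);

# Build one set per hash from its keys; the hashes themselves are untouched.
my $set_foo = Set::Scalar->new(keys %foo);
my $set_bar = Set::Scalar->new(keys %bar);

# intersection() returns a new set holding the members common to both.
my $common  = $set_foo->intersection($set_bar);
my @members = $common->members;
my @sorted  = sort @members;
print "@sorted\n";    # prints "pear plum"
```

Only the keys go into the sets here, so the hash values stay in the original hashes and can be looked up afterwards by key.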