Merging hashes (clobber duplicate keys)

baku has asked for the wisdom of the Perl Monks concerning the following question:

One I don't think has been asked here before, and I'm lost...

What's the best way to merge two hashes, with precedence to the values from the "mergee" over those already in the "merger"?

E.G.:

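package Sample::Merge;
use strict;
use warnings;

sub new {
 my $proto = shift;
 my $class = ref($proto) || $proto;  # "or" binds looser than "=", so use ||
 my $a = bless { default => 'value' }, $class;
 my %b = ( default => 'override' );
 # Copy every key of %b into the object, overwriting existing values.
 foreach (keys %b)
 {
   $a->{$_} = $b{$_};
 }
 return $a;
}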

I crammed that foreach down to `@$a->{ +keys %b } = @b{ +keys %b }`, then to `my @k = keys %b; @$a->{@k} = @b{@k}`, but I don't like it. Is there a better way?

(This is a huge mess, but it's integrated all over so I can't very well throw it out. BTW, the merge isn't really in the constructor.)


Replies are listed 'Best First'.
Re (tilly) 1: Merging hashes (clobber duplicate keys) by tilly (Archbishop) on Feb 07, 2001 at 01:07 UTC
The simplest construct to do this is probably: `%somehash = (%somehash, %override);` I would generally tackle this as: `$somehash{$_} = $override{$_} foreach keys %override;` I have seen suggestions that it should be possible to just: `push %somehash, %override;` but that isn't legal yet (if ever).
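A minimal sketch of the first two forms side by side. The names %somehash and %override follow the reply; the keys and values are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my %somehash = ( default => 'value',    colour => 'red' );
my %override = ( default => 'override', size   => 'large' );

# Form 1: flatten both hashes into one list; for duplicate keys the
# later pair wins. This builds a new hash and leaves %somehash alone.
my %merged = ( %somehash, %override );

# Form 2: copy each overriding key into %somehash in place.
$somehash{$_} = $override{$_} foreach keys %override;

# Both forms end up with the same contents.
print join( ', ', map { "$_=$merged{$_}" } sort keys %merged ), "\n";
# prints: colour=red, default=override, size=large
```

Form 1 rebuilds the whole hash, so Form 2 is probably the better fit when %override is much smaller than %somehash.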
Re: Merging hashes (clobber duplicate keys) by chipmunk (Parson) on Feb 07, 2001 at 01:07 UTC
I find that `@$a->{ EXPR }` won't work, because it expects $a to be an ARRAY ref. `@{ $a }{ EXPR }` should be used instead. Here's a slightly shorter version of your code: `@{$a}{keys %b} = values %b;` keys and values always return their results in corresponding order, provided the hash is not modified between the two calls.
Re: Re: Merging hashes (clobber duplicate keys) by dkubb (Deacon) on Feb 07, 2001 at 07:39 UTC
Here's an even shorter version: =) `@$a{ keys %b } = values %b;`
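A small runnable sketch of the slice-through-a-reference form. Here $self is a hypothetical stand-in for the object hash ref from the question, and the data is invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical data; $self plays the role of the blessed hash ref.
my $self = { default => 'value', other => 'kept' };
my %b    = ( default => 'override', extra => 'new' );

# Hash slice through a reference. @{$self}{...} and @$self{...} are
# equivalent; keys and values line up because %b is not modified
# between the two calls.
@$self{ keys %b } = values %b;

print "$_ => $self->{$_}\n" for sort keys %$self;
# prints:
# default => override
# extra => new
# other => kept
```

Keys that exist only in $self (such as other here) are left untouched; only the keys present in %b are written.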
Re: Re: Merging hashes (clobber duplicate keys) by baku (Scribe) on Feb 07, 2001 at 01:31 UTC
Re: `@{ }` Mea culpa ... That final code must be what I was trying to get my brain around. I knew there had to be a way without a temporary var... Thanks greatly.
Re: Merging hashes (clobber duplicate keys) by MeowChow (Vicar) on Feb 07, 2001 at 11:55 UTC
Fee Fi Fo Fark, I smell an interesting benchmark. Let's run this for hash sizes of 10 and 100, and consider the cases of overwriting with hashes of equal and smaller sizes...

use strict;
use warnings;
use Benchmark qw(cmpthese);

$| = 1;

for my $size1 (10,100) {
  for my $size2 ($size1, $size1/5) {
    my (%hash1, %hash2);
    @hash1{1..$size1} = 'blah';
    @hash2{1..$size2} = 'blah';
    print "\n\n--- A hash of size $size2 overwriting another hash of size $size1 ---\n";
    cmpthese (-3, {
      merge => sub { %hash1 = (%hash1, %hash2); },
      slice => sub { @hash1{keys %hash2} = values %hash2; },
      loop  => sub { $hash1{$_} = $hash2{$_} foreach keys %hash2; },
    });
  }
}

###### RESULTS ######

--- A hash of size 10 overwriting another hash of size 10 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  2 wallclock secs ( 3.15 usr +  0.00 sys =  3.15 CPU) @ 55774.24/s (n=175410)
     merge:  4 wallclock secs ( 3.05 usr +  0.00 sys =  3.05 CPU) @ 33034.39/s (n=100854)
     slice:  4 wallclock secs ( 3.17 usr +  0.00 sys =  3.17 CPU) @ 85991.49/s (n=272937)
         Rate merge  loop slice
merge 33034/s    --  -41%  -62%
loop  55774/s   69%    --  -35%
slice 85991/s  160%   54%    --

--- A hash of size 2 overwriting another hash of size 10 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  5 wallclock secs ( 3.18 usr +  0.00 sys =  3.18 CPU) @ 169740.66/s (n=540624)
     merge:  2 wallclock secs ( 3.09 usr +  0.00 sys =  3.09 CPU) @ 41938.61/s (n=129800)
     slice:  4 wallclock secs ( 3.46 usr +  0.00 sys =  3.46 CPU) @ 215512.84/s (n=746752)
          Rate merge  loop slice
merge  41939/s    --  -75%  -81%
loop  169741/s  305%    --  -21%
slice 215513/s  414%   27%    --

--- A hash of size 100 overwriting another hash of size 100 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  3 wallclock secs ( 3.21 usr +  0.00 sys =  3.21 CPU) @ 6564.41/s (n=21098)
     merge:  4 wallclock secs ( 3.20 usr +  0.00 sys =  3.20 CPU) @ 3574.91/s (n=11454)
     slice:  3 wallclock secs ( 3.25 usr +  0.00 sys =  3.25 CPU) @ 9876.46/s (n=32138)
        Rate merge  loop slice
merge 3575/s    --  -46%  -64%
loop  6564/s   84%    --  -34%
slice 9876/s  176%   50%    --

--- A hash of size 20 overwriting another hash of size 100 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  4 wallclock secs ( 3.18 usr +  0.00 sys =  3.18 CPU) @ 29304.55/s (n=93335)
     merge:  3 wallclock secs ( 3.18 usr +  0.00 sys =  3.18 CPU) @ 4737.32/s (n=15041)
     slice:  2 wallclock secs ( 3.20 usr +  0.00 sys =  3.20 CPU) @ 41374.22/s (n=132563)
         Rate merge  loop slice
merge  4737/s    --  -84%  -89%
loop  29305/s  519%    --  -29%
slice 41374/s  773%   41%    --

Analysis: Well, it looks like chipmunk's slice solution kicks some serious booty. Also note the terrible inefficiency of the merge solution for relatively small overwrites, since it has to go and build a whole new hash, instead of just adding a few values. In all, nothing particularly shocking, but good to know. Now if we could just get that Benchmark Arena going... :)

MeowChow
print $/='"',(`$^X\144oc $^X\146aq1`)[-2]