Re: Merging hashes (clobber duplicate keys)

Fee Fi Fo Fark, I smell an interesting benchmark.

Let's run this for hash sizes of 10 and 100, overwriting each with a hash of equal size and with one a fifth of its size...

use strict;
use warnings;
use Benchmark qw(cmpthese);
$| = 1;

for my $size1 (10,100) {
  for my $size2 ($size1, $size1/5) {
    my (%hash1, %hash2);

    @hash1{1..$size1} = 'blah';
    @hash2{1..$size2} = 'blah';

    print "\n\n--- A hash of size $size2 overwriting another hash of size $size1 ---\n";
    cmpthese (-3, {
      merge     => sub { %hash1 = (%hash1, %hash2); },
      slice     => sub { @hash1{keys %hash2} = values %hash2; },
      loop      => sub { $hash1{$_} = $hash2{$_} foreach keys %hash2; },
    });
  }
}


###### RESULTS ######

--- A hash of size 10 overwriting another hash of size 10 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  2 wallclock secs ( 3.15 usr +  0.00 sys =  3.15 CPU) @ 55774.24/s (n=175410)
     merge:  4 wallclock secs ( 3.05 usr +  0.00 sys =  3.05 CPU) @ 33034.39/s (n=100854)
     slice:  4 wallclock secs ( 3.17 usr +  0.00 sys =  3.17 CPU) @ 85991.49/s (n=272937)
         Rate merge  loop slice
merge 33034/s    --  -41%  -62%
loop  55774/s   69%    --  -35%
slice 85991/s  160%   54%    --


--- A hash of size 2 overwriting another hash of size 10 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  5 wallclock secs ( 3.18 usr +  0.00 sys =  3.18 CPU) @ 169740.66/s (n=540624)
     merge:  2 wallclock secs ( 3.09 usr +  0.00 sys =  3.09 CPU) @ 41938.61/s (n=129800)
     slice:  4 wallclock secs ( 3.46 usr +  0.00 sys =  3.46 CPU) @ 215512.84/s (n=746752)
          Rate merge  loop slice
merge  41939/s    --  -75%  -81%
loop  169741/s  305%    --  -21%
slice 215513/s  414%   27%    --


--- A hash of size 100 overwriting another hash of size 100 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  3 wallclock secs ( 3.21 usr +  0.00 sys =  3.21 CPU) @ 6564.41/s (n=21098)
     merge:  4 wallclock secs ( 3.20 usr +  0.00 sys =  3.20 CPU) @ 3574.91/s (n=11454)
     slice:  3 wallclock secs ( 3.25 usr +  0.00 sys =  3.25 CPU) @ 9876.46/s (n=32138)
        Rate merge  loop slice
merge 3575/s    --  -46%  -64%
loop  6564/s   84%    --  -34%
slice 9876/s  176%   50%    --


--- A hash of size 20 overwriting another hash of size 100 ---
Benchmark: running loop, merge, slice, each for at least 3 CPU seconds...
      loop:  4 wallclock secs ( 3.18 usr +  0.00 sys =  3.18 CPU) @ 29304.55/s (n=93335)
     merge:  3 wallclock secs ( 3.18 usr +  0.00 sys =  3.18 CPU) @ 4737.32/s (n=15041)
     slice:  2 wallclock secs ( 3.20 usr +  0.00 sys =  3.20 CPU) @ 41374.22/s (n=132563)
         Rate merge  loop slice
merge  4737/s    --  -84%  -89%
loop  29305/s  519%    --  -29%
slice 41374/s  773%   41%    --

Analysis: Well, it looks like chipmunk's slice solution kicks some serious booty: it is the fastest in all four runs, beating loop by 27% to 54% and merge by 160% to 773%. Also note the terrible inefficiency of the merge solution for relatively small overwrites, since it has to go and build a whole new hash from a flattened list, instead of just assigning a few values.
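For anyone who wants to convince themselves that the three approaches really are interchangeable, here is a minimal sketch. The data (%base, %override) and the as_string helper are made up for illustration and are not part of the benchmark; the point is only that all three end up with the same hash, while merge is the only one that rebuilds everything.

```perl
use strict;
use warnings;

# Hypothetical sample data, not taken from the benchmark.
my %base     = (a => 1, b => 2, c => 3);
my %override = (b => 20, d => 40);

# merge: flattens both hashes into one list and builds a new hash from it,
# so every pair in %base is copied even when %override is tiny
my %merged = (%base, %override);

# slice: touches only the keys present in %override
# (keys and values return their lists in matching order for an unmodified hash)
my %sliced = %base;
@sliced{keys %override} = values %override;

# loop: same effect as the slice, one key at a time
my %looped = %base;
$looped{$_} = $override{$_} for keys %override;

# Illustrative helper: render a hash as a sorted "key=value" string
# so the three results can be compared by eye.
sub as_string {
    my ($h) = @_;
    return join ',', map { "$_=$h->{$_}" } sort keys %$h;
}

print as_string(\%merged), "\n";   # a=1,b=20,c=3,d=40
print as_string(\%sliced), "\n";   # a=1,b=20,c=3,d=40
print as_string(\%looped), "\n";   # a=1,b=20,c=3,d=40
```

Note that slice and loop modify the target hash in place, whereas merge assigns a brand new list to it, which is where the extra work in the timings comes from.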

In all, nothing particularly shocking, but good to know. Now if we could just get that Benchmark Arena going... :)

   MeowChow                                               
                print $/='"',(`$^X\144oc $^X\146aq1`)[-2]
