emat has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

What's the best way to create a new hash which contains the difference between 2 other hashes?

I have two hashes built like this:

$new_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{15}
$new_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{1}{Value}{22}
$new_values{MachineType}{Blue_machine}{Unit}{2}{Channel}{0}{Value}{19}
$new_values{MachineType}{Red_machine}{Unit}{1}{Channel}{0}{Value}{22}


$old_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{14}
$old_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{1}{Value}{22}
$old_values{MachineType}{Blue_machine}{Unit}{2}{Channel}{0}{Value}{21}

This should be the result:

$end_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{14}
$end_values{MachineType}{Blue_machine}{Unit}{2}{Channel}{0}{Value}{19}
$end_values{MachineType}{Red_machine}{Unit}{1}{Channel}{0}{Value}{22}


I could iterate over %new_values, compare the keys to %old_values, and copy anything that differs over to %end_values.
But when I have a key that exists in new_values and doesn't exist in old_values, then I would get an error. I could do an "if defined" first but this would bloat the whole thing.
Is there a Module which enables me to do so in a gracious way?

Thanks in advance!

Re: Get the difference between 2 Hashes
by davidrw (Prior) on May 12, 2006 at 13:14 UTC
Re: Get the difference between 2 Hashes
by ruzam (Curate) on May 12, 2006 at 16:57 UTC
    You probably wouldn't consider this a gracious way, but...

    use strict;
    use warnings;
    use Data::Dumper;

    my %new_values;
    $new_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{15} = 1;
    $new_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{1}{Value}{22} = 2;
    $new_values{MachineType}{Blue_machine}{Unit}{2}{Channel}{0}{Value}{19} = 3;
    $new_values{MachineType}{Red_machine}{Unit}{1}{Channel}{0}{Value}{22} = 4;

    my %old_values;
    $old_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{14} = 5;
    $old_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{1}{Value}{22} = 6;
    $old_values{MachineType}{Blue_machine}{Unit}{2}{Channel}{0}{Value}{21} = 7;

    # Now find the difference
    if (my $end_values = hash_keydiff(\%new_values, \%old_values)) {
        print Dumper($end_values);
    }

    # level by level hash key compare
    # NOTE: at no time is the final hash value compared, only keys!!
    # returns 0 (no differences)
    # or a hash ref of the differences
    sub hash_keydiff {
        my ($hash1_ref, $hash2_ref) = @_;

        if (ref($hash1_ref) eq 'HASH') {
            return $hash1_ref unless ref($hash2_ref) eq 'HASH';

            # iterate over hash1_ref keys
            my %return;
            foreach (keys %$hash1_ref) {
                if (defined $hash2_ref->{$_}) {
                    # matching keys, dig to the next level
                    if (my $result = hash_keydiff($hash1_ref->{$_}, $hash2_ref->{$_})) {
                        $return{$_} = $result;
                    }
                }
                else {
                    $return{$_} = $hash1_ref->{$_};
                }
            }

            # iterate over hash2_ref keys
            # NOTE: we've already covered the matching keys above
            foreach (keys %$hash2_ref) {
                $return{$_} = $hash2_ref->{$_} unless defined $hash1_ref->{$_};
            }

            return (keys %return) ? \%return : 0;
        }

        return (ref($hash2_ref) eq 'HASH') ? $hash2_ref : 0;
    }

    BTW your example result is missing some differences:

    $end_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{15}
    $end_values{MachineType}{Blue_machine}{Unit}{2}{Channel}{0}{Value}{21}
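
The routine in this reply compares keys only, which works here because the readings are themselves stored as the last key. If the readings were stored as leaf values instead, a value-aware variant might look roughly like the sketch below. The name `hash_valuediff`, the trimmed-down layout (machine, unit, channel, then the reading as the leaf) and the one-directional behaviour (report what is new or changed relative to the old hash) are all assumptions made for illustration, not something taken from the thread.

```perl
use strict;
use warnings;

# Hypothetical variant: readings are leaf values, not a final key.
# Returns a hash ref holding every entry of $new that is missing from,
# or different in, $old. Leaves are assumed to be defined scalars.
sub hash_valuediff {
    my ($new, $old) = @_;
    my %diff;
    for my $key (keys %$new) {
        my $n = $new->{$key};
        my $o = (ref($old) eq 'HASH' && exists $old->{$key}) ? $old->{$key} : undef;
        if (ref($n) eq 'HASH') {
            # Recurse; treat a missing or non-hash counterpart as empty.
            my $sub = hash_valuediff($n, ref($o) eq 'HASH' ? $o : {});
            $diff{$key} = $sub if %$sub;
        }
        elsif (!defined $o || ref($o) || $o ne $n) {
            $diff{$key} = $n;    # new or changed leaf
        }
    }
    return \%diff;
}

my (%new_values, %old_values);
$new_values{Blue_machine}{1}{0} = 15;
$new_values{Blue_machine}{1}{1} = 22;
$new_values{Blue_machine}{2}{0} = 19;
$new_values{Red_machine}{1}{0}  = 22;
$old_values{Blue_machine}{1}{0} = 14;
$old_values{Blue_machine}{1}{1} = 22;
$old_values{Blue_machine}{2}{0} = 21;

my $end_values = hash_valuediff(\%new_values, \%old_values);
```

With this data the unchanged channel (unit 1, channel 1) drops out, and only the new or changed readings remain in `$end_values`.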
Re: Get the difference between 2 Hashes
by graff (Chancellor) on May 13, 2006 at 06:42 UTC
    I would agree with ruzam that a recursive function is the way to handle this, given that the hashes can differ at any intermediate level of the structure, and a path that goes down 7 layers in one hash simply does not exist beyond the fourth layer in the other hash.
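
To illustrate the point about paths that stop partway down: checking a deep key directly with `exists` will quietly create the intermediate levels (autovivification), so a small walker that checks one level at a time can be safer. The helper name `path_exists` is made up for this sketch; it is one possible approach, not something proposed in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: follow a list of keys through a nested hash and
# report whether the whole path exists, without creating any
# intermediate levels as a side effect.
sub path_exists {
    my ($node, @keys) = @_;
    for my $key (@keys) {
        return 0 unless ref($node) eq 'HASH' && exists $node->{$key};
        $node = $node->{$key};
    }
    return 1;
}

my %old_values;
$old_values{MachineType}{Blue_machine}{Unit}{1}{Channel}{0}{Value}{14} = 1;

my @red_path = qw(MachineType Red_machine Unit 1 Channel 0 Value 22);

# The Red_machine path ends at the second level in %old_values.
my $found = path_exists(\%old_values, @red_path);
print $found ? "path found\n" : "path missing\n";

# The lookup above did not add a Red_machine entry to %old_values.
my $vivified = exists $old_values{MachineType}{Red_machine};
print $vivified ? "Red_machine was created\n" : "Red_machine untouched\n";
```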

    As for his remark about the problem with your sample data, I would take that a little further -- I wasn't really sure what your criteria are for the particular values you want as output:

    • the first came from "old_values", and was the lower of the two having the same hash keys (14 vs. 15)
    • the second came from "new_values", and was the lower of the two having the same hash keys (19 vs. 21)
    Was that really your intention, or just a typo in the OP?

    Anyway, I wonder if your hash structure really needs to have so many layers. Is there anything besides "MachineType" at the top level? Anything besides "Unit" at the third level? In other words, couldn't the structure really be like this:

    $values{Blue_machine}{2}{0}{19}
    $values{Red_machine}{1}{0}{22}
    ...
    For that matter, you could flatten this out, and make the comparisons trivially simple, by just concatenating the keys into a single string, instead of using them as separate layers:
    $values{"Blue_2_0_19"}
    $values{"Red_1_0_22"}
    ...
    How are you loading up the hash structures in the first place? (An XML parse? I suppose it might be easier in some sense to stick with the deep structure if that's the case, though I'd still look for a way to refactor the parsing to allow a flatter data structure.)
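
The flattening idea above can be sketched as follows, assuming the trimmed-down layout (machine, unit, channel, reading). The `flatten` helper and the "/" separator are assumptions made for this example; "/" is used rather than "_" only because "Blue_machine" already contains an underscore. Once both structures are flat, the comparison is a pair of plain key lookups.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a nested hash into a flat one whose keys are
# the joined key paths, e.g. "Blue_machine/2/0/19".
# Note: empty sub-hashes produce no entry at all.
sub flatten {
    my ($node, $prefix, $flat) = @_;
    $flat ||= {};
    for my $key (keys %$node) {
        my $path = defined $prefix ? "$prefix/$key" : $key;
        if (ref($node->{$key}) eq 'HASH') {
            flatten($node->{$key}, $path, $flat);
        }
        else {
            $flat->{$path} = $node->{$key};
        }
    }
    return $flat;
}

my (%new_values, %old_values);
$new_values{Blue_machine}{1}{0}{15} = 1;
$new_values{Blue_machine}{1}{1}{22} = 1;
$new_values{Blue_machine}{2}{0}{19} = 1;
$new_values{Red_machine}{1}{0}{22}  = 1;
$old_values{Blue_machine}{1}{0}{14} = 1;
$old_values{Blue_machine}{1}{1}{22} = 1;
$old_values{Blue_machine}{2}{0}{21} = 1;

my $new_flat = flatten(\%new_values);
my $old_flat = flatten(\%old_values);

# Paths present on one side only.
my @only_new = sort grep { !exists $old_flat->{$_} } keys %$new_flat;
my @only_old = sort grep { !exists $new_flat->{$_} } keys %$old_flat;

print "only in new: $_\n" for @only_new;
print "only in old: $_\n" for @only_old;
```

Whether you want only the "new" side, only the "old" side, or both in the result depends on the criteria asked about above.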