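Performing an intersection, xor (symmetric difference), or join (union) operation on two histograms is fairly straightforward. In the code below, the union sums the counts, the intersection keeps the smaller count for each shared key, and the xor keeps only the keys that appear in exactly one histogram: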
    #! perl
    use strict;
    use warnings;
    use Data::Dump 'pp';

    my %hist1 = ( a => 2, b => 5, c => 7, );
    my %hist2 = ( b => 3, c => 1, d => 4, );

    my %union = %hist1;
    $union{$_} += $hist2{$_} for keys %hist2;

    my %inter;
    for (keys %hist1)
    {
        if (exists $hist2{$_})
        {
            my $val1 = $hist1{$_};
            my $val2 = $hist2{$_};
            $inter{$_} = ($val1 <= $val2) ? $val1 : $val2;
        }
    }

    my %xor;
    exists $hist2{$_} || ($xor{$_} = $hist1{$_}) for keys %hist1;
    exists $hist1{$_} || ($xor{$_} = $hist2{$_}) for keys %hist2;

    print "Histogram 1:  ", pp(\%hist1), "\n";
    print "Histogram 2:  ", pp(\%hist2), "\n";
    print "Union:        ", pp(\%union), "\n";
    print "Intersection: ", pp(\%inter), "\n";
    print "XOR:          ", pp(\%xor),   "\n";
Output:
    14:17 >perl 958_SoPW.pl
    Histogram 1:  { a => 2, b => 5, c => 7 }
    Histogram 2:  { b => 3, c => 1, d => 4 }
    Union:        { a => 2, b => 8, c => 8, d => 4 }
    Intersection: { b => 3, c => 1 }
    XOR:          { a => 2, d => 4 }

    14:17 >
(See also How do I merge two hashes? and How can I get the unique keys from two hashes?)
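If you need these operations in more than one place, the same logic can be wrapped in subroutines that take hash references. This is only a sketch: the names hist_union, hist_intersection, hist_xor and show are my own for illustration (they are not from any module), and it uses List::Util's min in place of the ternary comparison.

```perl
#! perl
use strict;
use warnings;
use List::Util 'min';

# Union: every key from either histogram, with the counts summed.
sub hist_union {
    my ($h1, $h2) = @_;
    my %out = %$h1;
    $out{$_} += $h2->{$_} for keys %$h2;
    return \%out;
}

# Intersection: keys present in both, keeping the smaller count.
sub hist_intersection {
    my ($h1, $h2) = @_;
    my %out;
    for (keys %$h1) {
        $out{$_} = min($h1->{$_}, $h2->{$_}) if exists $h2->{$_};
    }
    return \%out;
}

# Symmetric difference: keys present in exactly one histogram.
sub hist_xor {
    my ($h1, $h2) = @_;
    my %out;
    $out{$_} = $h1->{$_} for grep { !exists $h2->{$_} } keys %$h1;
    $out{$_} = $h2->{$_} for grep { !exists $h1->{$_} } keys %$h2;
    return \%out;
}

# Hypothetical helper: format a histogram with its keys sorted.
sub show {
    my ($h) = @_;
    return join ', ', map { "$_ => $h->{$_}" } sort keys %$h;
}

my %hist1 = ( a => 2, b => 5, c => 7 );
my %hist2 = ( b => 3, c => 1, d => 4 );

my $union = hist_union(\%hist1, \%hist2);
my $inter = hist_intersection(\%hist1, \%hist2);
my $xor   = hist_xor(\%hist1, \%hist2);

print "Union:        ", show($union), "\n";
print "Intersection: ", show($inter), "\n";
print "XOR:          ", show($xor),   "\n";
```

Each sub returns a new hash reference and leaves its arguments untouched, so the results can be chained, e.g. hist_union($xor, $inter).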
However, it is doubtful that this approach will scale to accommodate hashes containing gigabytes of data. For that scenario, you should probably be looking to use a database.
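As a very rough illustration of moving the counts out of memory, here is a sketch that ties the union hash to an on-disk DBM file using the core SDBM_File module. Please treat it as an assumption-laden toy: SDBM is not a real database and has a small limit on the size of each key/value pair, the temporary file location is made up for the demo, and for serious volumes you would more likely want a proper database reached through DBI.

```perl
#! perl
use strict;
use warnings;
use Fcntl;                    # for O_RDWR and O_CREAT
use SDBM_File;                # simple DBM implementation shipped with Perl
use File::Temp 'tempdir';

# Hypothetical location for the on-disk files (removed at exit).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/union";

# The tied hash reads and writes the DBM file rather than RAM.
tie my %union, 'SDBM_File', $file, O_RDWR | O_CREAT, 0666
    or die "Cannot tie $file: $!";

# Fold each histogram into the on-disk union, one key at a time.
for my $hist ({ a => 2, b => 5, c => 7 }, { b => 3, c => 1, d => 4 }) {
    $union{$_} = ($union{$_} // 0) + $hist->{$_} for keys %$hist;
}

# Collect a report while the hash is still tied, then release the file.
my @report = map { "$_ => $union{$_}" } sort keys %union;
untie %union;

print "$_\n" for @report;
```

The update loop is the same as for the in-memory union; only the storage behind %union changes. In a real job the input histograms would be streamed from files instead of being written inline.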
Hope that helps,
Athanasius <°(((>< contra mundum
Iustus alius egestas vitae, eros Piratica,