Messing about in Arrays of Hashes

nzgrover has asked for the wisdom of the Perl Monks concerning the following question:

Re: Messing about in Arrays of Hashes by graff (Chancellor) on Sep 21, 2004 at 03:37 UTC
It sounds like what you really want to do is go from an array of hashes into a single hash: `my %master_hash; for my $anon_hash_ref ( @AoH ) { $master_hash{ $$anon_hash_ref{id} } += $$anon_hash_ref{value}; }` Now %master_hash is keyed by the set of unique ids from the AoH, and its values are the sums of the values for matching ids.
*update:* (because you updated the question while I was making up the initial answer): assuming that "id_a" always relates to "value_x" and "id_b" to "value_y": `for my $anonhash ( @AoH ) { $master_hash{ $$anonhash{id_a} } += $$anonhash{value_x}; $master_hash{ $$anonhash{id_b} } += $$anonhash{value_y}; }`
*another update:* (because the previous update was wrong): Since the "id_a" and "id_b" values in your AoH might "intersect", this will keep them distinct: `for my $anonhash ( @AoH ) { $master_hash{ "a".$$anonhash{id_a} } += $$anonhash{value_x}; $master_hash{ "b".$$anonhash{id_b} } += $$anonhash{value_y}; }`
*FINAL UPDATE:* (sheesh!) Okay, based on your later clarification about the problem, I'd still suggest building a single hash as output, but now it should be either a HoH or HoA (whatever you prefer). One way (HoH): `for my $anonhash ( @AoH ) { my $newkey = join '_', 'a', $$anonhash{id_a}, 'b', $$anonhash{id_b}; $master_hash{$newkey}{value_x} += $$anonhash{value_x}; $master_hash{$newkey}{value_y} += $$anonhash{value_y}; }` Another way (HoA): `for my $anonhash ( @AoH ) { my $newkey = join '_', 'a', $$anonhash{id_a}, 'b', $$anonhash{id_b}; $master_hash{$newkey}[0] += $$anonhash{value_x}; $master_hash{$newkey}[1] += $$anonhash{value_y}; }` (Of course, you'll want to use only whichever of the two loops you prefer.)
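A minimal runnable sketch of the HoH variant, for anyone who wants to try it. The sample records are the four used elsewhere in this thread; `@AoH` and `%master_hash` follow graff's naming, while `$rec` and `$key` are just illustrative names.

```perl
use strict;
use warnings;

# Sample data: the four records used elsewhere in this thread.
my @AoH = (
    { id_a => 1, id_b => 5, value_x => 10, value_y => 5  },
    { id_a => 2, id_b => 3, value_x => 20, value_y => 10 },
    { id_a => 2, id_b => 3, value_x => 30, value_y => 20 },
    { id_a => 3, id_b => 7, value_x => 15, value_y => 15 },
);

my %master_hash;
for my $rec (@AoH) {
    # Composite key: records merge only when BOTH ids match.
    my $key = join '_', 'a', $rec->{id_a}, 'b', $rec->{id_b};
    $master_hash{$key}{value_x} += $rec->{value_x};
    $master_hash{$key}{value_y} += $rec->{value_y};
}

# Print one line per distinct (id_a, id_b) pair.
for my $key (sort keys %master_hash) {
    printf "%s: value_x=%d value_y=%d\n",
        $key, @{ $master_hash{$key} }{qw(value_x value_y)};
}
```

With this input the two `id_a => 2, id_b => 3` records collapse into a single entry under the key `a_2_b_3`, with value_x 50 and value_y 30; the other two records pass through unchanged.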
Re^2: Messing about in Arrays of Hashes by nzgrover (Scribe) on Sep 21, 2004 at 04:35 UTC
Sorry about whipping the carpet out like that. I had tried to simplify the real-world problem, of course, and realized after I had posted that I had gone too far.
Re: Messing about in Arrays of Hashes by bobf (Monsignor) on Sep 21, 2004 at 04:06 UTC
Any time you want to eliminate duplicates, think "hash". The following code creates a hash of hashes where the keys are id_a, summing value_x and value_y as it goes. Then the array of hashes is updated using the data in the HoH, sorted by id_a. `my %HoH; foreach my $hashref ( @AoH ) { $HoH{ ${ $hashref }{id_a} }{id_a} = ${ $hashref }{id_a}; $HoH{ ${ $hashref }{id_a} }{id_b} = ${ $hashref }{id_b}; $HoH{ ${ $hashref }{id_a} }{value_x} += ${ $hashref }{value_x}; $HoH{ ${ $hashref }{id_a} }{value_y} += ${ $hashref }{value_y}; } @AoH = sort { ${ $a }{id_a} <=> ${ $b }{id_a} } ( values %HoH );` This has been tested using your input data. I'm sure there are much more elegant ways of doing this... HTH
Update: Added code below to meet your new criteria, as stated in this reply. The final array is still sorted on id_a. Note: since id_a and id_b are part of the hash key, you could eliminate those individual keys in the HoH, but I left them in for simplicity. `my %HoH; foreach my $hashref ( @AoH ) { my $id_ab = ${ $hashref }{id_a} . '_' . ${ $hashref }{id_b}; $HoH{$id_ab}{id_a} = ${ $hashref }{id_a}; $HoH{$id_ab}{id_b} = ${ $hashref }{id_b}; $HoH{$id_ab}{value_x} += ${ $hashref }{value_x}; $HoH{$id_ab}{value_y} += ${ $hashref }{value_y}; } @AoH = sort { ${ $a }{id_a} <=> ${ $b }{id_a} } ( values %HoH );`
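One detail worth sketching: once the key is the id_a/id_b pair, several merged records can share an id_a, and sorting `values %HoH` on id_a alone leaves their relative order up to the hash. The variant below adds id_b as a tie-breaker. The input records and the name `@merged` are hypothetical, chosen only to show the tie.

```perl
use strict;
use warnings;

# Hypothetical input: three records share id_a == 2, two of them also share id_b.
my @AoH = (
    { id_a => 2, id_b => 9, value_x => 1,  value_y => 2  },
    { id_a => 1, id_b => 5, value_x => 10, value_y => 5  },
    { id_a => 2, id_b => 3, value_x => 20, value_y => 10 },
    { id_a => 2, id_b => 3, value_x => 30, value_y => 20 },
);

my %HoH;
for my $rec (@AoH) {
    my $id_ab = join '_', $rec->{id_a}, $rec->{id_b};
    $HoH{$id_ab}{id_a}     = $rec->{id_a};
    $HoH{$id_ab}{id_b}     = $rec->{id_b};
    $HoH{$id_ab}{value_x} += $rec->{value_x};
    $HoH{$id_ab}{value_y} += $rec->{value_y};
}

# Sort numerically by id_a, then by id_b to break ties.
my @merged = sort {
    $a->{id_a} <=> $b->{id_a} or $a->{id_b} <=> $b->{id_b}
} values %HoH;

print join( ', ', map { "$_->{id_a}/$_->{id_b}=$_->{value_x}" } @merged ), "\n";
```

The comparison falls through to id_b only when the id_a comparison returns 0, so the output order no longer depends on hash ordering.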
Re: Messing about in Arrays of Hashes by tachyon (Chancellor) on Sep 21, 2004 at 03:53 UTC
The simplest approach is to use a temporary array but you could splice if memory is an issue. `$var = [ { 'id_a' => '1', 'id_b' => '5', 'value_x' => '10', 'value_y' => '5', }, { 'id_a' => '2', 'id_b' => '3', 'value_x' => '20', 'value_y' => '10', }, { 'id_a' => '2', 'id_b' => '3', 'value_x' => '30', 'value_y' => '20', }, { 'id_a' => '3', 'id_b' => '7', 'value_x' => '15', 'value_y' => '15', }, ]; my $tmp; my $last_id = ''; for my $hash( @$var ) { if ( $hash->{id_a} eq $last_id ) { $tmp->[-1]->{value_x} += $hash->{value_x}; $tmp->[-1]->{value_y} += $hash->{value_y}; } else { $last_id = $hash->{id_a}; push @$tmp, $hash; } } use Data::Dumper; print Dumper $tmp;` cheers tachyon
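A sketch of the same temporary-array idea adjusted for the later clarification that both ids must match. It still assumes the input is ordered so that duplicates sit next to each other. Two things differ from the loop above and are my own choices, not tachyon's: it compares id_b as well as id_a, and it pushes a shallow copy of each hash so that summing does not alter the records in `$var`.

```perl
use strict;
use warnings;

my $var = [
    { id_a => 1, id_b => 5, value_x => 10, value_y => 5  },
    { id_a => 2, id_b => 3, value_x => 20, value_y => 10 },
    { id_a => 2, id_b => 3, value_x => 30, value_y => 20 },
    { id_a => 3, id_b => 7, value_x => 15, value_y => 15 },
];

my @tmp;
for my $hash (@$var) {
    my $prev = $tmp[-1];    # undef on the first pass
    if (    $prev
        and $prev->{id_a} eq $hash->{id_a}
        and $prev->{id_b} eq $hash->{id_b} )
    {
        # Same id pair as the previous record: fold the values in.
        $prev->{value_x} += $hash->{value_x};
        $prev->{value_y} += $hash->{value_y};
    }
    else {
        push @tmp, { %$hash };    # shallow copy, so @$var is left untouched
    }
}

print scalar(@tmp), " records; merged value_x = $tmp[1]{value_x}\n";
```

If modifying the original hashes is acceptable, pushing `$hash` itself instead of `{ %$hash }` saves the copies.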
Re: Messing about in Arrays of Hashes by Errto (Vicar) on Sep 21, 2004 at 03:44 UTC
A first go at it, that will work even if the ids are not sorted (assuming we start with $var): `my %hash; $hash{$_->{id}} += $_->{value} for @$var; $var = [ map { +{id => $_, value => $hash{$_} } } sort keys %hash ];` Second go, relying on your initial assumption that the ids are sorted (so duplicates are adjacent), possibly more efficient but uglier for sure. It walks the array backwards so that splice never shifts an index that is still to be visited: `for my $i (reverse 1 .. $#$var) { if ($var->[$i]->{id} eq $var->[$i-1]->{id}) { $var->[$i-1]->{value} += $var->[$i]->{value}; splice @$var, $i, 1; } }`
Re: Messing about in Arrays of Hashes by graff (Chancellor) on Sep 21, 2004 at 03:56 UTC
Would you ever have an AoH like the following, and if so, what would be the right thing to do with it? `( { id_a => 1, id_b => 2, value_x => 20, value_y => 30 }, { id_a => 3, id_b => 4, value_x => 40, value_y => 50 }, { id_a => 1, id_b => 4, value_x => 100, value_y => 200 } )` Would you want "id_a==1" to come out with 120, and "id_b==4" to come out with 250? Or do you need to keep track of distinct "id_a/b" tuples?
Re^2: Messing about in Arrays of Hashes by nzgrover (Scribe) on Sep 21, 2004 at 04:37 UTC
Only if BOTH ids match do I then want to add the corresponding values together and knock out the "duplicate" hash.
Re: Messing about in Arrays of Hashes by TedPride (Priest) on Sep 21, 2004 at 08:00 UTC
The following requires an additional hash and an additional array, but both contain only references, so overhead should be extremely low. Enjoy... `$var = [ { 'id_a' => '1', 'id_b' => '5', 'value_x' => '10', 'value_y' => '5' }, { 'id_a' => '2', 'id_b' => '3', 'value_x' => '20', 'value_y' => '10' }, { 'id_a' => '2', 'id_b' => '3', 'value_x' => '30', 'value_y' => '20' }, { 'id_a' => '3', 'id_b' => '7', 'value_x' => '15', 'value_y' => '15' }]; my ($id, %ids, @var2); for (my $i = 0; $i <= $#$var; $i++) { $id = @$var[$i]->{'id_a'} . ' ' . @$var[$i]->{'id_b'}; if (!$ids{$id}) { push(@var2, @$var[$i]); $ids{$id} = @$var[$i]; } else { $ids{$id}->{'value_x'} += @$var[$i]->{'value_x'}; $ids{$id}->{'value_y'} += @$var[$i]->{'value_y'}; } } $var = \@var2; my $line = 0; foreach (@$var) { print '[' . $line++ . ']' . ' id_a -> ' . $_->{'id_a'} . ' id_b -> ' . $_->{'id_b'} . ' value_x -> ' . $_->{'value_x'} . ' value_y -> ' . $_->{'value_y'} . "\n"; }` You can cut the last part (from `my $line = 0;` onwards) if you want; that's only to demonstrate that the code works.
Re^2: Messing about in Arrays of Hashes by TedPride (Priest) on Sep 21, 2004 at 08:08 UTC
Additional note - if you wish to edit this code for other arrays of hashes, just change the following lines: Creates unique ID: `$id = @$var[$i]->{'id_a'} . ' ' . @$var[$i]->{'id_b'};` Merges data: `$ids{$id}->{'value_x'} += @$var[$i]->{'value_x'}; $ids{$id}->{'value_y'} += @$var[$i]->{'value_y'};`
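If the same merge is needed for several record layouts, the two lines that change could be turned into parameters instead. The following is only a sketch of that idea: `merge_records` is a hypothetical helper, not something from TedPride's post, and it copies each first-seen record rather than summing into the original hashes.

```perl
use strict;
use warnings;

# Hypothetical helper: merge records whose @$key_fields all match,
# summing @$sum_fields. First-seen order is preserved.
sub merge_records {
    my ( $records, $key_fields, $sum_fields ) = @_;
    my ( %seen, @merged );
    for my $rec (@$records) {
        # "\0" as separator is an assumption: it must not occur in the ids.
        my $id = join "\0", map { $rec->{$_} } @$key_fields;
        if ( my $first = $seen{$id} ) {
            $first->{$_} += $rec->{$_} for @$sum_fields;
        }
        else {
            push @merged, $seen{$id} = { %$rec };    # copy, then remember it
        }
    }
    return \@merged;
}

my $var = [
    { id_a => 1, id_b => 5, value_x => 10, value_y => 5  },
    { id_a => 2, id_b => 3, value_x => 20, value_y => 10 },
    { id_a => 2, id_b => 3, value_x => 30, value_y => 20 },
    { id_a => 3, id_b => 7, value_x => 15, value_y => 15 },
];

my $merged = merge_records( $var, [qw(id_a id_b)], [qw(value_x value_y)] );

printf "id_a -> %s id_b -> %s value_x -> %s value_y -> %s\n",
    @{$_}{qw(id_a id_b value_x value_y)}
    for @$merged;
```

Adapting it to another array of hashes then means changing only the two field lists in the call, for example `[qw(id)]` and `[qw(value)]` for the simpler layout from the start of the thread.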