Count number of elements in HoH and sum them for other keys

Sosi has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Count number of elements in HoH and sum them for other keys by smls (Friar) on Jun 03, 2014 at 11:54 UTC
Your specification for the desired result hash is not quite valid: you cannot have a lone array reference in a hash, only `key => value` pairs. This can be fixed by giving the array references a key, for example "values". The result hash would then look like this:

`( A => { count => 3, B => { count => 2, values => ["n1", "n2"] }, C => { count => 1, values => ["n1" ] } }, D => { count => 3, E => { count => 2, values => ["n2", "n4"] }, F => { count => 1, values => ["n1" ] } } )`

It can easily be generated...

...from the HoH that you already have:

You can transform the existing HoH into the specified result hash using two nested loops, making use of the fact that `scalar @array` gives the number of elements in an array:

`my %hoh = ( A => { B => [ "n1", "n2" ], C => [ "n1" ] }, D => { E => [ "n2", "n4" ], F => [ "n1" ] }, ); foreach my $col1 (keys %hoh) { my $count1 = 0; foreach my $col2 (keys %{$hoh{$col1}}) { my $count2 = scalar @{$hoh{$col1}{$col2}}; $hoh{$col1}{$col2} = { count => $count2, values => $hoh{$col1}{$col2} }; $count1 += $count2; } $hoh{$col1}{count} = $count1; }`

...from the original data:

If you do the counting directly in the code that generates the HoH in the first place, it's even easier: just increment the counters for both levels as you go along:

`my %hoh; while (<DATA>) { chomp; my ($c1, $c2, $c3) = split; $hoh{$c1}{count}++; $hoh{$c1}{$c2}{count}++; push @{$hoh{$c1}{$c2}{values}}, $c3; } __DATA__ A B n1 A B n2 A C n1 D E n2 D E n4 D F n1`

--- Edit: Refactored the answer to make it more structured.
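One thing to keep in mind when reading this layout back: the `count` key sits at the same level as the second-column keys, so code that walks the result has to skip it. Here is a minimal sketch of the "from the original data" approach plus such a walk. It assumes the input lines are held in a hypothetical `@lines` array rather than read from `DATA`, and `@report` is just an illustrative name.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the whitespace-separated input.
my @lines = ("A B n1", "A B n2", "A C n1", "D E n2", "D E n4", "D F n1");

my %hoh;
for my $line (@lines) {
    my ($c1, $c2, $c3) = split ' ', $line;
    $hoh{$c1}{count}++;          # rows seen for this first-column key
    $hoh{$c1}{$c2}{count}++;     # rows seen for this (first, second) pair
    push @{ $hoh{$c1}{$c2}{values} }, $c3;
}

# Walk the result. 'count' shares a level with the second-column keys,
# so it is filtered out before descending.
my @report;
for my $c1 (sort keys %hoh) {
    push @report, "$c1: $hoh{$c1}{count}";
    for my $c2 (sort grep { $_ ne 'count' } keys %{ $hoh{$c1} }) {
        my $inner = $hoh{$c1}{$c2};
        push @report, "  $c2: $inner->{count} (@{ $inner->{values} })";
    }
}
print "$_\n" for @report;
```

If the second column could ever contain the literal string `count`, it would collide with the counter. In that case a layout that keeps the children under their own key avoids the problem.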
Re^2: Count number of elements in HoH and sum them for other keys by Sosi (Sexton) on Jun 03, 2014 at 12:55 UTC
Thank you so much. I was a bit confused at the beginning: I thought that you had to specify a starting value for $hoh{$c1}{$c2}{count}. By the way, and I know I'm straying a bit from the initial question, what if I wanted to start that count at 5? Would specifying `$hoh{$c1}{count}=5;` work if I put it before the increment in your while loop? Thanks!
Re^3: Count number of elements in HoH and sum them for other keys by smls (Friar) on Jun 03, 2014 at 13:21 UTC
> I thought that you had to specify a starting value for $hoh{$c1}{$c2}{count}.

When you dereference or modify a non-existing array or hash element, it will automatically "spring to life", including all the necessary intermediate hashes/arrays. For example:

`my %test; $test{a}[2]{b} = 'Hello'; # %test now contains: # ( a => [ undef, # undef, # { b => "Hello" } ] )`

It's called autovivification, and it's one of the nice features that make Perl special... :) See Wikipedia and perlreftut for more info. In addition, the ++ (auto-increment) operator silently treats `undef` as `0`. So you don't need to specify an initial value.

> what if I wanted to start that count at 5?

One solution would be to create the hash first, and then use another loop to add 5 to each counter. Alternatively, you can do a check inside the loop (before incrementing!) to see if the counter has already been incremented previously, and if not, initialize it with the number 5:

`if (!$hoh{$c1}{count}) { $hoh{$c1}{count} = 5; } # verbose form`

`$hoh{$c1}{count} ||= 5; # shortcut`

(See C-style Logical Or and Assignment Operators.)
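Both points can be checked with a small sketch. Note one substitution of mine: it uses `//=` (defined-or assignment, Perl 5.10 or later) instead of the `||=` shown in the post, since `//=` only tests whether anything is stored and would not overwrite a counter that legitimately holds 0. The key names are illustrative.

```perl
use strict;
use warnings;

my %hoh;

# Autovivification: the intermediate hashes are created on first use,
# and ++ treats the missing (undef) counter as 0 without a warning.
$hoh{A}{B}{count}++;

# Starting a counter at 5: initialise only if nothing is stored yet,
# then increment as usual.
for my $c1 (qw(A A A D)) {
    $hoh{$c1}{count} //= 5;
    $hoh{$c1}{count}++;
}

print "A: $hoh{A}{count}, D: $hoh{D}{count}\n";   # prints "A: 8, D: 6"
```

Because the initialisation happens before the increment, the first row already takes the counter to 6. If the first row should yield 5 instead, initialise to 4 or skip the increment on first sight.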
Re: Count number of elements in HoH and sum them for other keys by BillKSmith (Monsignor) on Jun 03, 2014 at 12:33 UTC
A simple hash is all you need. Use the things you want to count as keys.

`use strict; use warnings; my %hash; while (<DATA>) { my @elements = split; $hash{join ' => ', @elements[0,1]}++; $hash{$elements[0]}++; } foreach my $key (sort keys %hash) { printf "%-6s => %d\n", $key, $hash{$key}; } __DATA__ A B n1 A B n2 A C n1 D E n2 D E n4 D F n1`

OUTPUT:

`A      => 3`
`A => B => 2`
`A => C => 1`
`D      => 3`
`D => E => 2`
`D => F => 1`

Bill
Re: Count number of elements in HoH and sum them for other keys by kcott (Archbishop) on Jun 03, 2014 at 13:54 UTC
G'day Sosi,

Parsing tab-, comma-, whatever-separated files has various issues that have already been dealt with by Text::CSV [see also: Text::CSV_XS and Text::CSV_PP]. This is probably not a wheel you need to reinvent: I've shown usage of Text::CSV in the example code (below).

Your output data structure is flawed. You have instances of this general code:

`X => { count => n, [ ... ] }`

You have three elements in the hashref, which is a problem: key/value pairs result in an even number of elements. That generates an "Odd number of elements in anonymous hash" warning. In the example code (below), I've added a "`values`" key both for the arrayref (i.e. the third element) and under each top-level key, where it holds the second-level hash.

`X => { count => n, values => [ ... ] }`

Here's the example code:

`#!/usr/bin/env perl -l use strict; use warnings; use autodie; use Text::CSV; my $csv = Text::CSV::->new({sep_char => "\t"}); my %data; while (my $row = $csv->getline(\*DATA)) { ++$data{$row->[0]}{count}; ++$data{$row->[0]}{values}{$row->[1]}{count}; push @{$data{$row->[0]}{values}{$row->[1]}{values}}, $row->[2]; } use Data::Dump; dd \%data; __DATA__ A B n1 A B n2 A C n1 D E n2 D E n4 D F n1`

Which outputs:

`{ A => { count => 3, values => { B => { count => 2, values => ["n1", "n2"] }, C => { count => 1, values => ["n1"] }, }, }, D => { count => 3, values => { E => { count => 2, values => ["n2", "n4"] }, F => { count => 1, values => ["n1"] }, }, }, }`

-- Ken
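The odd-number-of-elements problem is easy to reproduce. The following is only an illustrative sketch (the variable names are made up): it builds a hash from `count`, a number, and a lone arrayref, captures the resulting warning, and then shows the fixed form with a named `values` key.

```perl
use strict;
use warnings;

# Collect warnings so they can be inspected instead of printed.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Three elements in a hash constructor: 'count', 2, and a lone arrayref.
my $bad = +{ count => 2, [ "n1", "n2" ] };

# The arrayref was stringified into a key ("ARRAY(0x...)") with an undef
# value, so the list it pointed to is no longer reachable through $bad.
my @odd_keys = grep { /^ARRAY\(/ } keys %$bad;

# Naming the arrayref gives an even number of elements and keeps the data.
my $good = { count => 2, values => [ "n1", "n2" ] };

print "warnings captured: ", scalar(@warnings), "\n";
print "stray keys in bad hash: @odd_keys\n";
print "values in good hash: @{ $good->{values} }\n";
```

So the warning is not just cosmetic: without a key of its own, the array data is effectively lost.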
Re^2: Count number of elements in HoH and sum them for other keys (sledgehammer) by Anonymous Monk on Jun 03, 2014 at 16:50 UTC
> Parsing tab-, comma-, whatever-separated files has various issues that have already been dealt with by Text::CSV

Like what? That module deals with all the features and intricacies of the 'official' CSV format, such as quoting/escaping and embedded newlines/NUL bytes. But the OP never specified that the input data is in that complex format; from the looks of it, it's just simple ASCII strings delimited by tabs and newlines. No need to take a sledgehammer to crack a nut...

> This is probably not a wheel you need to reinvent

Applying a simple, tried-and-true Perl idiom is not much of an 'invention': `while(<>) { my @fields = split /\t/; ... }`
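One detail worth adding when using `split /\t/` rather than a bare `split`: the bare form splits on whitespace and so discards the trailing newline, while `split /\t/` leaves it attached to the last field unless the line is chomped first. A minimal sketch, assuming tab-separated lines held in a hypothetical `@lines` array and a flat `%count` hash chosen only for illustration:

```perl
use strict;
use warnings;

# Hypothetical tab-separated input lines, newline included as read from a file.
my @lines = ("A\tB\tn1\n", "A\tB\tn2\n", "A\tC\tn1\n");

my %count;
for my $line (@lines) {
    chomp $line;    # without this, the last field keeps its "\n"
    my ($c1, $c2, $c3) = split /\t/, $line;
    $count{$c1}++;            # per first-column key
    $count{"$c1\t$c2"}++;     # per (first, second) pair
}

# What split /\t/ returns when the newline is left in place:
my @raw = split /\t/, "A\tB\tn1\n";
printf "last field without chomp: %d characters\n", length $raw[-1];
```

For the counting itself the stray newline would not matter here, since only the first two columns are used as keys, but it would end up in any stored third-column values.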
Re: Count number of elements in HoH and sum them for other keys by Anonymous Monk on Jun 03, 2014 at 11:49 UTC
Seems like an exercise in a linked-list data structure that accounts for multiple occurrences of a value. For your ideal hash reference structure, when you are populating the hash reference: (1) add a "count" key whose value is incremented (starting at 1), in the same way you would add the "B" key to the hash reference stored under the "A" key; (2) give the array reference a key as well, so that the hash reference is valid.
Re^2: Count number of elements in HoH and sum them for other keys by Anonymous Monk on Jun 03, 2014 at 11:53 UTC
And to multiply, iterate over the keys of each hash reference (via the `keys` function) to find the counts, store them, then multiply. See also the "perlref" and "perlreftut" PODs.
Re^3: Count number of elements in HoH and sum them for other keys by Sosi (Sexton) on Jun 03, 2014 at 12:51 UTC
Thanks, your comments were quite helpful. I read them before seeing the reply from smls below, and was implementing something similar to what he ended up doing. Thanks!
