jcrush has asked for the wisdom of the Perl Monks concerning the following question:

Hello. I am not a novice, but I am having a hard time turning my thought-out solution into code. I am seeking advanced wisdom on a tricky subject involving arrays and hashes.

I have to compute statistics on how many unique MAC addresses ($usrMac) appear per access point ($apMac) per hour. I have all the foundational regexes to parse the data; I'm just stuck on coding a solution that matches my thought process.

Unknowns: number of access points, number of user devices, and number of hours in a day (-: joking :-)
use strict;
use warnings;
use diagnostics;
use Data::Dumper qw(Dumper);

our (@uniqueMac);

while (<>) {
    my ($dateHour,$usrMac,$apMac) = (/regex/);
    push (@usrMac, $usrMac);
    @uniqueMac = map {$_ = 1} @usrMac;
    push (@array, $dateHour{$apMac} = scalar(@uniqueMac);
}
print @array;
Output: should look like
Date , Hour: AP# : Unique Devices per AP per Hour
$dateHour : $apMac : scalar(@uniqueMac)
2014-04-07, Hour 01: AP1 : 301
2014-04-07, Hour 01: AP2 : 313
2014-04-07, Hour 01: AP3 : 132
. . . every hour per day . . .
2014-04-07, Hour 21: AP1 : 130
2014-04-07, Hour 21: AP2 : 310
2014-04-07, Hour 21: AP3 : 13
. . . for every day . . .
2014-04-08, Hour 01: AP1 : 302
2014-04-08, Hour 01: AP2 : 321
2014-04-08, Hour 01: AP3 : 131
. . . every hour per day . . .
2014-04-08, Hour 22: AP1 : 122
2014-04-08, Hour 22: AP2 : 234
2014-04-08, Hour 22: AP3 : 311
Thanks, jcrush

Re: Array of Hashes of Arrays with Counts of Unique Elements
by roboticus (Chancellor) on Apr 25, 2014 at 19:40 UTC

    jcrush:

    Assuming you're parsing log files (naturally sorted in order by time), I'd suggest something like this:

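        my ($prevDateHour, %curHour);

        while (<>) {
            # parsit() stands in for your own regex-based parser
            my ($dateHour, $usrMac, $apMac) = parsit();

            # The hour changed: report the finished hour, then start over
            if (defined $prevDateHour and $dateHour ne $prevDateHour) {
                report($prevDateHour, \%curHour);
                %curHour = ();
            }
            $prevDateHour = $dateHour;

            # Hash keys are unique, so each MAC counts once per AP
            $curHour{$apMac}{$usrMac} = 1;
        }
        # Report the last hour, which the loop itself never flushes
        report($prevDateHour, \%curHour) if defined $prevDateHour;

        sub report {
            my ($hour, $seen) = @_;
            for my $ap (sort keys %$seen) {
                print "$hour: $ap : ", scalar(keys %{ $seen->{$ap} }), "\n";
            }
        }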

    (If they're not sorted, or you have multiple files, merge/sort them into a single file first.) This solution will work no matter *how* many hours your days have. ;^D
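
    If sorting first is not practical, one alternative is to key a nested hash by hour as well, so the order of the input no longer matters. The following is only a rough sketch of that idea, not the approach above: the line format, parse_line() and count_unique() are all made up for illustration, and it keeps every hour in memory at once, which may be too much for very large logs.

    ```perl
    use strict;
    use warnings;

    # Hypothetical line format, assumed only for this sketch:
    #   "2014-04-07 01:15:32 <usrMac> <apMac>"
    sub parse_line {
        my ($line) = @_;
        my ($date, $hour, $usrMac, $apMac) =
            $line =~ /^(\d{4}-\d\d-\d\d) (\d\d):\d\d:\d\d (\S+) (\S+)/
            or return;    # skip lines that do not match
        return ("$date, Hour $hour", $usrMac, $apMac);
    }

    # Returns a hash ref: {dateHour}{apMac} => number of distinct user MACs
    sub count_unique {
        my (@lines) = @_;
        my %seen;
        for my $line (@lines) {
            my ($dateHour, $usrMac, $apMac) = parse_line($line) or next;
            # Hash keys are unique, so repeats of a MAC collapse to one key
            $seen{$dateHour}{$apMac}{$usrMac} = 1;
        }
        my %count;
        for my $dateHour (keys %seen) {
            for my $apMac (keys %{ $seen{$dateHour} }) {
                # keys in scalar context yields the number of keys
                $count{$dateHour}{$apMac} = keys %{ $seen{$dateHour}{$apMac} };
            }
        }
        return \%count;
    }

    # Made-up sample data, deliberately out of time order
    my @demo = (
        '2014-04-07 01:15:32 aa:aa:aa:aa:aa:01 AP1',
        '2014-04-07 02:03:10 aa:aa:aa:aa:aa:01 AP1',
        '2014-04-07 01:47:05 aa:aa:aa:aa:aa:02 AP1',
        '2014-04-07 01:59:59 aa:aa:aa:aa:aa:01 AP1',
        '2014-04-07 01:20:00 aa:aa:aa:aa:aa:03 AP2',
    );

    my $count = count_unique(@demo);
    for my $dateHour (sort keys %$count) {
        for my $apMac (sort keys %{ $count->{$dateHour} }) {
            print "$dateHour: $apMac : $count->{$dateHour}{$apMac}\n";
        }
    }
    ```

    With the sample data this reports 2 devices for AP1 and 1 for AP2 in hour 01, and 1 device for AP1 in hour 02. The trade-off is memory: the streaming version only ever holds one hour, while this one holds everything until the end.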

    Update: Tweaked code a little (reset curHour, prevDateHour so reporting could work correctly).

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Thank you, roboticus.

      Yes, I will either merge the files from the 6 servers into 1 large file, or use the diamond operator <> with @ARGV to read multiple files consecutively.

      Thanks, jcrush