in reply to nested hashes or object oriented

"Once I get these lines of stats, I can then throw the results into a database for more advanced querying and storage."

Seems to me that this is advanced querying. :) SQL has some handy tools to group and count. Anyways, since you are going to use a database later on, sounds like this script is a throw away. If so, OO is waaay overkill. Just get the job done procedurally and move on.

My first thoughts were to use DBD::AnyData, but it doesn't support "advanced" SQL commands such as HAVING and GROUP BY ... shame, but here is my take on this problem with that module. Note that i don't quite fully understand what you are exactly trying to do with your counting (i didn't bother with 'new' and 'unique' hits), but this should serve as a starting point should you wish to explore DBD::AnyData. Also note that i changed your file from space delimeted to tab delimted and named the file data.txt and replaced all instances of agnecy to agency - was that typo intentional? You also had once agency listed as null - occording to your desired output, i guesstimated that the agency was number 2.

use strict; use warnings; use DBI; use Data::Dumper; my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):'); $dbh->func( qw(log Tab data.txt), { col_names => 'date,user,agency,url,garbage,success,module' }, 'ad_catalog', ); my %pass = fetch_rows('pass'); my %fail = fetch_rows('fail'); print join("\t",qw(Agency Url Pass Fail Module)),"\n"; # loop thru @pass, try to fetch from @fail # i'll let you decide how to sort ;) for my $key (keys %pass) { my ($agency,$url,$module) = split(':',$key); my $pass = $pass{$key}; my $fail = $fail{$key} || 0; print join("\t",$agency,$url,$pass,$fail,$module),"\n"; } # trick here is to append agency, url, and module # so they are treated as one unique entity # too bad you can't use unique with AnyDBD ... # watch out, i use a colon as the delimter - YMMV sub fetch_rows { my $success = shift; my $sth = $dbh->prepare(' SELECT agency, url, module FROM log WHERE success = ? '); $sth->execute($success); my %hash; $hash{ join(':', $_->{agency}, $_->{url}, $_->{module}) }++ while $_ = $sth->fetchrow_hashref; return %hash; }
And here is the output on your data file (with said 'corrections'):
Agency  Url     Pass    Fail    Module
agency2 url1    2       1       mod1
agency1 url1    3       1       mod1
agency1 url1    1       0       mod2
agency2 url2    1       0       mod2
UPDATE: on second thought ... just listen to BrowserUk. :) The important concept is getting the right unique rows, and DBD::AnyDB is probably overkill for this problem, only because it doesn't handle COUNT and GROUP BY. (It's still a fabulous module for data converstions though.)

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)