(jeffa) Re: nested hashes or object oriented

"Once I get these lines of stats, I can then throw the results into a database for more advanced querying and storage."

Seems to me that this is advanced querying. :) SQL has some handy tools to group and count. Anyways, since you are going to use a database later on, sounds like this script is a throw away. If so, OO is waaay overkill. Just get the job done procedurally and move on.

My first thoughts were to use DBD::AnyData, but it doesn't support "advanced" SQL commands such as HAVING and GROUP BY ... shame, but here is my take on this problem with that module. Note that i don't quite fully understand what you are exactly trying to do with your counting (i didn't bother with 'new' and 'unique' hits), but this should serve as a starting point should you wish to explore DBD::AnyData. Also note that i changed your file from space delimeted to tab delimted and named the file data.txt and replaced all instances of agnecy to agency - was that typo intentional? You also had once agency listed as null - occording to your desired output, i guesstimated that the agency was number 2.

use strict;
use warnings;
use DBI;
use Data::Dumper;

my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):');
$dbh->func(
   qw(log Tab data.txt),
   { col_names => 'date,user,agency,url,garbage,success,module' },
   'ad_catalog',
);

my %pass = fetch_rows('pass');
my %fail = fetch_rows('fail');

print join("\t",qw(Agency Url Pass Fail Module)),"\n";

# loop thru @pass, try to fetch from @fail
# i'll let you decide how to sort ;)
for my $key (keys %pass) {
   my ($agency,$url,$module) = split(':',$key);
   my $pass = $pass{$key};
   my $fail = $fail{$key} || 0;
   print join("\t",$agency,$url,$pass,$fail,$module),"\n";
}

# trick here is to append agency, url, and module 
# so they are treated as one unique entity
# too bad you can't use unique with AnyDBD ...
# watch out, i use a colon as the delimter - YMMV
sub fetch_rows {
   my $success = shift;
   my $sth = $dbh->prepare('
      SELECT agency, url, module
      FROM log WHERE success = ?
   ');
   $sth->execute($success);

   my %hash;
   $hash{ join(':', $_->{agency}, $_->{url}, $_->{module}) }++
      while $_ = $sth->fetchrow_hashref;

   return %hash;
}
[download]

And here is the output on your data file (with said 'corrections'): Agency Url Pass Fail Module agency2 url1 2 1 mod1 agency1 url1 3 1 mod1 agency1 url1 1 0 mod2 agency2 url2 1 0 mod2 UPDATE: on second thought ... just listen to BrowserUk. :) The important concept is getting the right unique rows, and DBD::AnyDB is probably overkill for this problem, only because it doesn't handle COUNT and GROUP BY. (It's still a fabulous module for data converstions though.)

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)

Comment on (jeffa) Re: nested hashes or object oriented Download Code