
"Once I get these lines of stats, I can then throw the results into a database for more advanced querying and storage."

Seems to me that this is advanced querying. :) SQL has some handy tools to group and count. Anyway, since you are going to use a database later on, it sounds like this script is a throwaway. If so, OO is way overkill. Just get the job done procedurally and move on.

My first thought was to use DBD::AnyData, but it doesn't support "advanced" SQL such as HAVING and GROUP BY. A shame, but here is my take on the problem with that module. Note that I don't fully understand what you are trying to do with your counting (I didn't bother with 'new' and 'unique' hits), but this should serve as a starting point should you wish to explore DBD::AnyData. Also note that I changed your file from space-delimited to tab-delimited, named it data.txt, and replaced all instances of agnecy with agency. Was that typo intentional? You also had one agency listed as null; according to your desired output, I guesstimated that it was agency number 2.

use strict;
use warnings;
use DBI;
use Data::Dumper;

my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):');
$dbh->func(
   qw(log Tab data.txt),
   { col_names => 'date,user,agency,url,garbage,success,module' },
   'ad_catalog',
);

my %pass = fetch_rows('pass');
my %fail = fetch_rows('fail');

print join("\t",qw(Agency Url Pass Fail Module)),"\n";

# loop through %pass, look up the matching count in %fail
# (keys that appear only in %fail are not reported)
# i'll let you decide how to sort ;)
for my $key (keys %pass) {
   my ($agency,$url,$module) = split(':',$key);
   my $pass = $pass{$key};
   my $fail = $fail{$key} || 0;
   print join("\t",$agency,$url,$pass,$fail,$module),"\n";
}

# trick here is to join agency, url, and module into one key
# so they are treated as one unique entity
# too bad you can't use unique with DBD::AnyData ...
# watch out, i use a colon as the delimiter - YMMV
sub fetch_rows {
   my $success = shift;
   my $sth = $dbh->prepare('
      SELECT agency, url, module
      FROM log WHERE success = ?
   ');
   $sth->execute($success);

   my %hash;
   # use a lexical row variable instead of clobbering the global $_
   while (my $row = $sth->fetchrow_hashref) {
      $hash{ join(':', @{$row}{qw(agency url module)}) }++;
   }

   return %hash;
}

And here is the output on your data file (with said 'corrections'):

Agency   Url   Pass  Fail  Module
agency2  url1  2     1     mod1
agency1  url1  3     1     mod1
agency1  url1  1     0     mod2
agency2  url2  1     0     mod2

UPDATE: on second thought ... just listen to BrowserUk. :) The important concept is getting the right unique rows, and DBD::AnyData is probably overkill for this problem, only because it doesn't handle COUNT and GROUP BY. (It's still a fabulous module for data conversions though.)
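For what it's worth, the "right unique rows" idea can be sketched without DBI at all, using a nested hash keyed on agency, url, and module. This is only a rough sketch, not BrowserUk's code: the sub names tally_hits and report_lines are made up for illustration, the column order is assumed to match the col_names list in the script above, and the sample rows are invented.

```perl
use strict;
use warnings;

# Count pass/fail hits per (agency, url, module) from tab-delimited lines.
# Assumed column order (same as col_names in the DBD::AnyData script):
#   date, user, agency, url, garbage, success, module
sub tally_hits {
    my @lines = @_;
    my %count;
    for my $line (@lines) {
        chomp $line;
        next unless length $line;
        my ($agency, $url, $success, $module) = (split /\t/, $line)[2, 3, 5, 6];
        $count{$agency}{$url}{$module}{$success}++;
    }
    return \%count;
}

# Turn the nested hash into sorted, tab-delimited report lines.
sub report_lines {
    my ($count) = @_;
    my @out = join("\t", qw(Agency Url Pass Fail Module));
    for my $agency (sort keys %$count) {
        for my $url (sort keys %{ $count->{$agency} }) {
            for my $module (sort keys %{ $count->{$agency}{$url} }) {
                my $hits = $count->{$agency}{$url}{$module};
                push @out, join("\t", $agency, $url,
                    $hits->{pass} || 0, $hits->{fail} || 0, $module);
            }
        }
    }
    return @out;
}

# Invented sample rows, only to show the calling convention.
my @sample = (
    join("\t", qw(2003-01-01 bob agency1 url1 x pass mod1)),
    join("\t", qw(2003-01-01 sue agency1 url1 x fail mod1)),
    join("\t", qw(2003-01-02 bob agency2 url2 x pass mod2)),
);
print "$_\n" for report_lines(tally_hits(@sample));
```

To run it on the real file, pass the lines read from a filehandle opened on data.txt to tally_hits instead of @sample. Unlike the loop over %pass above, this version also reports combinations that only ever failed, and the nested keys sidestep the colon-delimiter caveat.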

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)

In reply to (jeffa) Re: nested hashes or object oriented by jeffa
in thread nested hashes or object oriented by ctaustin
