
This is what the source data looks like.
Element 1 - date
Element 2 - user
Element 3 - agency
Element 4 - url
Element 5 - garbage
Element 6 - type (pass, fail, new...)
Element 7 - module
Element 8 - more garbage

05/Jun/2003:00:01:23 user1 agency1 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:03:17 user2 null url1 garbage fail mod1 more_garbage
05/Jun/2003:00:03:42 user1 agency1 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:05:03 user6 agency2 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:08:34 user3 agency2 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:11:59 user4 agency2 url2 garbage fail mod1 more_garbage
05/Jun/2003:00:14:30 user5 agency2 url1 garbage new_a mod1 more_garbage
05/Jun/2003:00:15:02 user1 agency1 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:16:56 user7 agency2 url2 garbage pass mod2 more_garbage
05/Jun/2003:00:17:31 user1 agency1 url1 garbage fail mod1 more_garbage
05/Jun/2003:00:17:31 user1 agency1 url1 garbage pass mod2 more_garbage

There are six of these log files generated each day. What I need to get out of them is this:

Agency   URL   Pass  Fail  New  Unique  Module
agency1  url1  3     1     0    1       mod1
agency1  url1  1     0     0    1       mod2
agency2  url1  2     1     1    3       mod1
agency2  url2  1     0     0    1       mod2

and so on. As you can see from this example, a failure doesn't count towards the unique number; the unique count covers only users who logged in successfully. The pass count represents the total number of successful logins, so it also captures recurring logins and so forth. In short, I want stats on what is being used and how.
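The unique-count rule described above could be sketched roughly as follows. This is only an illustration under assumptions: the records use the eight whitespace-separated fields listed earlier, anything that is not a "fail" counts as a successful login (which is what the expected output seems to suggest for the "new" rows), and the names %seen and %unique are made up for the example.

```perl
use strict;
use warnings;

# Illustrative records in the format shown above (not real log data).
my @lines = (
    '05/Jun/2003:00:01:23 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:03:42 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:17:31 user1 agency1 url1 garbage fail mod1 more_garbage',
    '05/Jun/2003:00:05:03 user6 agency2 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:14:30 user5 agency2 url1 garbage new_a mod1 more_garbage',
    '05/Jun/2003:00:11:59 user4 agency2 url2 garbage fail mod1 more_garbage',
);

my (%seen, %unique);    # hypothetical names, chosen for this sketch

for my $line (@lines) {
    my (undef, $user, $agency, $url, undef, $type, $module) = split ' ', $line;

    next if $type eq 'fail';    # failures never count towards unique

    my $key = join '|', $agency, $url, $module;

    # Count each user only the first time they appear for this combination.
    $unique{$key}++ unless $seen{$key}{$user}++;
}

print "$_ => $unique{$_}\n" for sort keys %unique;
```

The inner hash %seen acts as a set of users per agency/url/module combination, so repeated logins by the same user raise the pass total elsewhere but not the unique count.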

Once I get these lines of stats, I can then throw the results into a database for more advanced querying and storage. That part I am good to go on. I am just having some trouble gathering the stats in the form that I want/need.

I have tried using nested hashes like

my (%agency,(%url,(%module,(%type,$type_count))));
my $pass_count=0;
my $fail_count=0;
my $new_count=0;

    while ($line=<FILE>) {
        ($e1,$e2,$e3,$e4,$e5,$e6,$e7,$e8)=(split/\s+/,$line);
        $pf=substr($pf,0,4);  
         if ($pf eq "pass") {
             $agency{$e4,{$e7,{$e6,$pass_count++}}};
        if ($pf eq "fail) {
                     $agency{$e4,{$e7,{$e6,$fail_count++}}};
        if ($pf eq "new_") {
                     $agency{$e4,{$e7,{$e6,$new_count++}}};

This didn't seem to work. Part of the problem may be that I can't figure out how to print the results; the other part is that I am not convinced the hashes are getting populated properly.
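For comparison, a nested hash of counters is usually populated with chained subscripts and printed by walking each level in turn. The following is a minimal sketch, not a drop-in fix: it assumes the same eight-field records, folds every type starting with "new" into a single "new" bucket, and uses a made-up name, %stats.

```perl
use strict;
use warnings;

# Illustrative records in the format shown above (not real log data).
my @lines = (
    '05/Jun/2003:00:01:23 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:03:42 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:15:02 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:17:31 user1 agency1 url1 garbage fail mod1 more_garbage',
    '05/Jun/2003:00:17:31 user1 agency1 url1 garbage pass mod2 more_garbage',
    '05/Jun/2003:00:14:30 user5 agency2 url1 garbage new_a mod1 more_garbage',
);

my %stats;    # $stats{$agency}{$url}{$module}{$type} = count (hypothetical layout)

for my $line (@lines) {
    my (undef, undef, $agency, $url, undef, $type, $module) = split ' ', $line;

    $type = 'new' if $type =~ /^new/;    # assumption: new_a etc. share one bucket

    # Autovivification creates the inner hashes on first use.
    $stats{$agency}{$url}{$module}{$type}++;
}

# Walk the three key levels; missing counters print as 0.
for my $agency (sort keys %stats) {
    for my $url (sort keys %{ $stats{$agency} }) {
        for my $module (sort keys %{ $stats{$agency}{$url} }) {
            my $c = $stats{$agency}{$url}{$module};
            printf "%s %s %d %d %d %s\n", $agency, $url,
                $c->{pass} || 0, $c->{fail} || 0, $c->{new} || 0, $module;
        }
    }
}
```

To check whether the structure is being filled at all, the core Data::Dumper module can dump it with `print Dumper(\%stats);` after a `use Data::Dumper;` line.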

I then tried an object oriented way

package Agency;

sub new {
    my $class={};

    $class-> {agency}=undef;
    $class->{url}=undef;
    $class->{pass}=undef;
    $class->{fail}=undef;
    $class->{new}=undef;
    $class->{module}=undef;

    bless $class;
    return $class;
}

sub init {

    my $class=shift;
    $class->{agency}=shift;
    $class->{url}=shift;
    $class->{pass}=shift;
    $class->{fail}=shift;
    $class->{new}=shift;
    $class->{module}=shift;
}

sub display {

    my $class=shift;
    print "Agency: $class->{agency}  URL: $class->{url}  Pass: $class->{pass}  Fail: $class{fail}  New: $class{new}  Module:  $class->{module}\n"
}

package main;

my $new_agency=Agency->new();

foreach $file (glob('file_name*.txt')) {
    open (FILE,$file)||die ("unable to open $file\n");
    print "working on $file\n";
    while ($line=<FILE>) {
        ($ts,$w,$agency,$url,$x,$pf,$module,$z)=(split/\s+/,$line);
        $pf=substr($pf,0,4);  
           if ($pf eq "pass") {
               $new_agency->init($agency,$url,$pass_count++,$fail_count,$new_count,$module);
           }
           elsif ($pf eq "fail") {
               $new_agency->init($agency,$url,$pass_count,$fail_count++,$new_count,$module);
           }
           elsif ($pf eq "new_") {
               $new_agency->init($agency,$url,$pass_count,$fail_count,$new_count++,$module);
           }
    } #while
} # foreach

$new_agency->display();

This didn't work either. It just gave me the agency and module information for the last line in the last file read, with one counter being properly incremented. I tried doing the display inside the loop, but this didn't help. I am guessing the problem lies in the fact that I am calling "init" in each case.
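One way to look at the symptom described above: a single object is created before the loop and every call to init overwrites its fields, so only the last record survives. A possible alternative, sketched here under assumptions, keeps one object per agency/url/module combination in a hash and lets each object bump its own counter. The class name AgencyStats, its methods count and as_string, and the hash %objects are all hypothetical and not part of the original code.

```perl
use strict;
use warnings;

package AgencyStats;    # hypothetical class, not the original Agency package

sub new {
    my ($class, %args) = @_;
    my $self = {
        agency => $args{agency},
        url    => $args{url},
        module => $args{module},
        pass   => 0,
        fail   => 0,
        new    => 0,
    };
    return bless $self, $class;
}

# Increment one counter; anything other than pass/fail/new is ignored.
sub count {
    my ($self, $type) = @_;
    $self->{$type}++ if $type =~ /^(?:pass|fail|new)$/;
    return $self;
}

# Return the fields as one space-separated line.
sub as_string {
    my ($self) = @_;
    return join ' ', @{$self}{qw(agency url pass fail new module)};
}

package main;

# Illustrative records in the format shown above (not real log data).
my @lines = (
    '05/Jun/2003:00:01:23 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:03:42 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:17:31 user1 agency1 url1 garbage fail mod1 more_garbage',
    '05/Jun/2003:00:16:56 user7 agency2 url2 garbage pass mod2 more_garbage',
);

my %objects;    # one AgencyStats object per agency|url|module key

for my $line (@lines) {
    my (undef, undef, $agency, $url, undef, $type, $module) = split ' ', $line;
    $type = 'new' if $type =~ /^new/;    # assumption: new_a etc. share one bucket

    my $key = join '|', $agency, $url, $module;
    $objects{$key} ||= AgencyStats->new(
        agency => $agency, url => $url, module => $module,
    );
    $objects{$key}->count($type);
}

print $objects{$_}->as_string, "\n" for sort keys %objects;
```

Whether this is preferable to a plain nested hash is a judgment call; the objects mainly help if more behaviour per combination is expected later.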

Despite the fact that everything else that I have tried has failed, I think that one of these two options seems the most legitimate. So I am now asking for both advice and opinions on what the best method for obtaining this result set would be.

In reply to nested hashes or object oriented by ctaustin
