This is what the source data looks like.
Element 1 - date
Element 2 - user
Element 3 - agency
Element 4 - url
Element 5 - garbage
Element 6 - type (pass, fail, new...)
Element 7 - module
Element 8 - more garbage
05/Jun/2003:00:01:23 user1 agency1 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:03:17 user2 null url1 garbage fail mod1 more_garbage
05/Jun/2003:00:03:42 user1 agency1 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:05:03 user6 agency2 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:08:34 user3 agency2 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:11:59 user4 agency2 url2 garbage fail mod1 more_garbage
05/Jun/2003:00:14:30 user5 agency2 url1 garbage new_a mod1 more_garbage
05/Jun/2003:00:15:02 user1 agency1 url1 garbage pass mod1 more_garbage
05/Jun/2003:00:16:56 user7 agency2 url2 garbage pass mod2 more_garbage
05/Jun/2003:00:17:31 user1 agency1 url1 garbage fail mod1 more_garbage
05/Jun/2003:00:17:31 user1 agency1 url1 garbage pass mod2 more_garbage
There are six of these log files generated each day. What I need to get out of them is this:
Agency URL Pass Fail New Unique Module
agency1 url1 3 1 0 1 mod1
agency1 url1 1 0 0 1 mod2
agency2 url1 2 1 1 3 mod1
agency2 url2 1 0 0 1 mod2
and so on. As you can see from this example, a failure doesn't count towards the unique number: the unique count only covers successful logins. The pass count represents
the total number of logins, thus capturing recurring logins and so forth. The point is to get stats on how and what is being used.
Once I get these lines of stats, I can throw the results into a database for more advanced querying and storage. That part I am good to go on. I am just having
some trouble gathering the stats in the form that I want/need.
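A minimal sketch of the counting rules described above, not a definitive implementation. It assumes the eight whitespace-separated fields listed at the top, that any type beginning with "new" is counted as New, and that a user counts once toward Unique per agency/URL/module group for any non-fail line (an assumption based on the sample report, where a new_a line appears to count). The name tally is hypothetical.

```perl
use strict;
use warnings;

# tally() is a hypothetical helper: it takes log lines and returns a hash ref
# mapping "agency<TAB>url<TAB>module" to pass/fail/new/unique figures.
sub tally {
    my @lines = @_;
    my (%stats, %seen);
    for my $line (@lines) {
        my (undef, $user, $agency, $url, undef, $type, $module) = split ' ', $line;
        next unless defined $module;                  # skip short or blank lines
        my $kind = $type =~ /^new/ ? 'new' : $type;   # new_a, new_b, ... become "new"
        my $key  = join "\t", $agency, $url, $module;
        $stats{$key}{$kind}++;
        # Assumption: any non-fail line counts its user once toward "unique".
        $seen{$key}{$user} = 1 unless $kind eq 'fail';
    }
    for my $key (keys %stats) {
        $stats{$key}{unique} = keys %{ $seen{$key} || {} };   # number of distinct users
    }
    return \%stats;
}

# Five lines taken from the sample data above.
my $stats = tally(
    '05/Jun/2003:00:01:23 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:03:42 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:15:02 user1 agency1 url1 garbage pass mod1 more_garbage',
    '05/Jun/2003:00:17:31 user1 agency1 url1 garbage fail mod1 more_garbage',
    '05/Jun/2003:00:17:31 user1 agency1 url1 garbage pass mod2 more_garbage',
);

# Print in the report's column order: Agency URL Pass Fail New Unique Module.
for my $key (sort keys %$stats) {
    my ($agency, $url, $module) = split /\t/, $key;
    my $s = $stats->{$key};
    print join(' ', $agency, $url, (map { $s->{$_} || 0 } qw(pass fail new unique)), $module), "\n";
}
```

For these five lines the sketch prints the first two rows of the sample report. The same tally() call could be fed the lines of all six daily files before printing.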
I have tried using nested hashes like
my (%agency,(%url,(%module,(%type,$type_count))));
my $pass_count=0;
my $fail_count=0;
my $new_count=0;
while ($line=<FILE>) {
($e1,$e2,$e3,$e4,$e5,$e6,$e7,$e8)=(split/\s+/,$line);
$pf=substr($e6,0,4); # element 6 is the type
if ($pf eq "pass") {
$agency{$e4,{$e7,{$e6,$pass_count++}}};
}
elsif ($pf eq "fail") {
$agency{$e4,{$e7,{$e6,$fail_count++}}};
}
elsif ($pf eq "new_") {
$agency{$e4,{$e7,{$e6,$new_count++}}};
}
} # while
This didn't seem to work. Part of this may be due to the fact that I can't figure out how to print the results, the other is that I am not convinced that the hashes are getting properly populated.
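For comparison, here is a hedged sketch of how a hash of hashes is usually populated and printed in Perl. Nested levels are addressed by chaining subscripts, one per level; a comma-separated list inside a single subscript instead forms one joined key, and without an assignment nothing is stored. The record layout and the helper name format_counts are assumptions made for illustration.

```perl
use strict;
use warnings;

my %agency;   # agency -> url -> module -> type -> count

# Hypothetical pre-split records: [agency, url, module, type].
my @records = (
    [qw(agency1 url1 mod1 pass)],
    [qw(agency1 url1 mod1 pass)],
    [qw(agency1 url1 mod1 fail)],
    [qw(agency2 url2 mod2 pass)],
);

for my $rec (@records) {
    my ($ag, $url, $mod, $type) = @$rec;
    $agency{$ag}{$url}{$mod}{$type}++;   # intermediate hashes are created on demand
}

# format_counts() is a hypothetical helper that walks each level in sorted order
# and returns one text row per agency/url/module combination.
sub format_counts {
    my ($tree) = @_;
    my @out;
    for my $ag (sort keys %$tree) {
        for my $url (sort keys %{ $tree->{$ag} }) {
            for my $mod (sort keys %{ $tree->{$ag}{$url} }) {
                my $c = $tree->{$ag}{$url}{$mod};
                push @out, join ' ', $ag, $url, $mod,
                    map { $_ . '=' . ($c->{$_} || 0) } qw(pass fail new);
            }
        }
    }
    return @out;
}

print "$_\n" for format_counts(\%agency);
```

Printing is then a matter of one loop per level, which also makes it easy to check whether the structure was populated as intended.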
I then tried an object oriented way
package Agency;
sub new {
my $class={};
$class-> {agency}=undef;
$class->{url}=undef;
$class->{pass}=undef;
$class->{fail}=undef;
$class->{new}=undef;
$class->{module}=undef;
bless $class;
return $class;
}
sub init {
my $class=shift;
$class->{agency}=shift;
$class->{url}=shift;
$class->{pass}=shift;
$class->{fail}=shift;
$class->{new}=shift;
$class->{module}=shift;
}
sub display {
my $class=shift;
print "Agency: $class->{agency} URL: $class->{url} Pass: $class->{pass} Fail: $class->{fail} New: $class->{new} Module: $class->{module}\n";
}
package main;
my $new_agency=Agency->new();
foreach $file (glob('file_name*.txt')) {
open (FILE,$file)||die ("unable to open $file\n");
print "working on $file\n";
while ($line=<FILE>) {
($ts,$w,$agency,$url,$x,$pf,$module,$z)=(split/\s+/,$line);
$pf=substr($pf,0,4);
if ($pf eq "pass") {
$new_agency->init($agency,$url,$pass_count++,$fail_count,$new_count,$module);
}
elsif ($pf eq "fail") {
$new_agency->init($agency,$url,$pass_count,$fail_count++,$new_count,$module);
}
elsif ($pf eq "new_") {
$new_agency->init($agency,$url,$pass_count,$fail_count,$new_count++,$module);
}
} #while
} # foreach
$new_agency->display();
This didn't work either: it just gave me the agency and module information for the last line in the last file read, with one counter
being properly incremented. I tried doing the display inside the loop but this didn't help. I am guessing the problem lies somewhere
with the fact that I am calling "init" on the same object in each case.
Despite the fact that everything else that I have tried has failed, I think that one of these two options seems the most legitimate. So I am now
asking for both advice and opinions on what the best method for obtaining this result set would be.
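On the object-oriented attempt: a single object whose fields are overwritten by every init call can only ever hold the last line's values. One possible arrangement, sketched here under assumptions, is to keep one object per agency/URL/module combination in a hash and have each line bump a counter on the matching object. The class name Agency is reused from the code above; the record and summary methods and the record layout are hypothetical.

```perl
use strict;
use warnings;

package Agency;   # class name reused from the post; methods below are hypothetical

my %COUNTER = map { $_ => 1 } qw(pass fail new);   # fields that may be incremented

sub new {
    my ($class, %args) = @_;
    my $self = {
        agency => $args{agency},
        url    => $args{url},
        module => $args{module},
        pass   => 0,
        fail   => 0,
        new    => 0,
    };
    return bless $self, $class;
}

# Bump one counter for one log line instead of overwriting every field.
sub record {
    my ($self, $kind) = @_;
    $self->{$kind}++ if $COUNTER{$kind};
    return $self;
}

sub summary {
    my ($self) = @_;
    return join ' ', map { $_ . ': ' . $self->{$_} } qw(agency url pass fail new module);
}

package main;

my %by_key;   # "agency url module" -> Agency object

# Hypothetical pre-split records: [agency, url, module, type].
for my $rec ([qw(agency1 url1 mod1 pass)], [qw(agency1 url1 mod1 fail)], [qw(agency2 url2 mod2 pass)]) {
    my ($agency, $url, $module, $kind) = @$rec;
    my $key = "$agency $url $module";
    $by_key{$key} ||= Agency->new(agency => $agency, url => $url, module => $module);
    $by_key{$key}->record($kind);
}

for my $key (sort keys %by_key) {
    my $obj = $by_key{$key};
    print $obj->summary, "\n";
}
```

Whether objects are worth the extra code over a plain nested hash is a judgment call; both approaches hold the same counts.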