I have two sets of Apache logs, and I am trying to compare the Date, IP, and User Agent fields between them to find matches. I am using Apache::LogRegex to parse each line into a hash. I am pretty rusty on my Perl and am trying to work out the best way to do this.
I have 3 log files that contain GET requests and 3 log files that contain POST requests. I want to compare the fields listed above (Date, IP, or User Agent) between the two data structures that hold the GET and POST logs to find matches.
I can read each line and access the data easily, but I am stuck on how to get the contents of all 6 files into two separate data structures. Each line comes back as a hash, so perhaps a hash of hashes for each of the GET and POST logs, or an array of hashes? Is my push statement below the best way to do it, or do I need to set up some kind of keys so that entries don't overwrite each other?
As a side note, what would be the best way to compare fields between two arrays of hashes? I could come up with a hacky, really expensive approach using lots of nested loops, but I assume there is a faster way.
Please help with a little direction on the best way to accomplish this.
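For the comparison, one common alternative to nested loops is to index one set of records in a hash keyed by the field being compared, then make a single pass over the other set and look each record up. The following is only a minimal sketch under stated assumptions: the sample records are made up, `match_on` is a hypothetical helper name, and the field names (`%h`, `%{User-Agent}i`) are assumed to match the keys the parser produces.

```perl
use strict;
use warnings;

# Hypothetical sample records standing in for parsed GET and POST log lines.
my @get_records = (
    { '%h' => '10.0.0.1', '%{User-Agent}i' => 'Mozilla/4.0' },
    { '%h' => '10.0.0.2', '%{User-Agent}i' => 'Opera/9.0'   },
);
my @post_records = (
    { '%h' => '10.0.0.2', '%{User-Agent}i' => 'Opera/9.0'   },
    { '%h' => '10.0.0.3', '%{User-Agent}i' => 'Mozilla/4.0' },
);

# Hypothetical helper: return [left_record, right_record] pairs whose
# $field values are equal. Builds a lookup hash once instead of looping
# over every combination of records.
sub match_on {
    my ($field, $left, $right) = @_;

    # Index the left-hand records: field value => list of records.
    my %index;
    for my $rec (@$left) {
        next unless defined $rec->{$field};
        push @{ $index{ $rec->{$field} } }, $rec;
    }

    # One pass over the right-hand records, looking each one up.
    my @pairs;
    for my $rec (@$right) {
        next unless defined $rec->{$field};
        my $hits = $index{ $rec->{$field} } or next;
        push @pairs, [ $_, $rec ] for @$hits;
    }
    return @pairs;
}

my @ip_matches = match_on('%h', \@get_records, \@post_records);
my @ua_matches = match_on('%{User-Agent}i', \@get_records, \@post_records);

print scalar(@ip_matches), "\n";   # prints 1
print scalar(@ua_matches), "\n";   # prints 2
```

Each record is touched once per comparison, so the cost grows with the total number of records rather than with the product of the two list sizes. To require several fields to match at once, the index key could be built by joining those fields with a separator that cannot occur in the data.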
#!/usr/bin/perl -w
use Apache::LogRegex;
my $lr;
my $log_format = '"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""';
eval { $lr = Apache::LogRegex->new($log_format) };
die "Unable to parse log line: $@" if ($@);
my $get_logs = ("march-logs/march-bannat.txt",
"march-logs/march-logs-web2/march-bannat.txt",
"march-logs/march-logs-web3/march-bannat.txt");
my $post_logs = ("march-logs/march-post.txt",
"march-logs/march-logs-web2/march-post.txt",
"march-logs/march-logs-web3/march-post.txt");
my %data;
my %getRecords;
my $postRecords;
foreach ($get_logs)
{
    my @array = &logToHash($_);
}

sub logToHash
{
    my $file = $_;
    my %hash;
    my @AoH;
    open LOG, $file or die $!;
    while ( my $line_from_logfile = <LOG> )
    {
        eval { %data = $lr->parse($line_from_logfile); };
        if (%data)
        {
            push @AoH, %data;
        }
    }
    return @AoH;
}
I noticed that when I print Dumper(\@array) after the subroutine returns, it contains a lot of data, but each key is printed on its own line above its value instead of as $key => $value. Is this correct? Am I pushing the data incorrectly?
'"%h',
'access_log.9.gz:XX.XX.XX.XX',
'%{Referer}i',
'-',
'%t',
'[19/Mar/2009:02:03:46 -0500]',
'%r',
'GET /02230909 HTTP/1.1',
'%{User-Agent}i\\""',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts; GTB5; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
'%b',
'20',
'%l',
'-',
'%u',
'-',
'%>s',
'302',
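A dump like the one above, with keys and values as alternating plain elements, is what Perl produces when a hash is pushed onto an array directly: the hash is flattened into a list of scalars. Pushing a reference to a copy of the hash keeps each log line as one element. The sketch below illustrates the difference; `fake_parse` is a hypothetical stand-in for the parser call and simply returns a flat key/value list, as the `%data = $lr->parse(...)` assignment in the code implies.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser: returns a flat key/value list.
sub fake_parse {
    return (
        '%h'  => '10.0.0.1',
        '%t'  => '[19/Mar/2009:02:03:46 -0500]',
        '%>s' => '302',
    );
}

my (@flat, @aoh);
my %data = fake_parse();

push @flat, %data;      # flattens: key, value, key, value, ...
push @aoh,  { %data };  # one element: a reference to a copy of the hash

print scalar(@flat), "\n";   # prints 6
print scalar(@aoh),  "\n";   # prints 1
print $aoh[0]{'%h'}, "\n";   # prints 10.0.0.1
```

With the hash-reference form, Dumper shows each record as a `{ key => value, ... }` block, and nothing overwrites anything because every line gets its own array slot. Separately, the stray `"` characters in the dumped keys (`'"%h'` and `'%{User-Agent}i\\""'`) suggest that the outer double quotes inside the single-quoted `$log_format` string are being treated as part of the format, which may be worth checking.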