I have two sets of Apache logs, and I am trying to compare their Date, IP, and User-Agent data to find matches. I am using Apache::LogRegex to load all of the data into a hash. I am pretty rusty on my Perl and am trying to work out the best way to do this.

I have 3 log files that contain GET requests and 3 log files that contain POST requests. I want to compare the fields listed above (Date, IP, or User-Agent) between the two hashes that hold the GET and POST logs to find matches.

I can read each line and access the data easily, but I am stuck on how to get the contents of all 6 files into two separate hashes. Each line comes back as a hash, so perhaps I need a hash of hashes for each of the GET and POST logs, or an array of hashes? Is my push statement below the best way to do it, or do I need to set up some kind of keys so the entries don't overwrite each other?
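One way this could be approached is an array of hash references per request type, filled by a single subroutine that takes a list of file names. The sketch below is only an illustration under assumptions: `parse_line` is a hypothetical stand-in for `$lr->parse` so that the example runs without Apache::LogRegex installed, `write_sample` and the log lines are made up for the demo, and the key names assume a format string without stray quote characters. Note that a list of file names has to live in an array (`my @get_logs = (...)`), because assigning a parenthesized list to a scalar keeps only the last element.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for $lr->parse($line): returns a flat key/value
# list for a combined-format line, or an empty list if it does not match.
sub parse_line {
    my ($line) = @_;
    my ($host, $time, $request, $agent) =
        $line =~ /^(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" \d+ \S+ "[^"]*" "([^"]*)"/
        or return;
    return ('%h' => $host, '%t' => $time, '%r' => $request,
            '%{User-Agent}i' => $agent);
}

# Read every file named in the argument list and return one combined
# list of hash references, one per parsed line.
sub load_logs {
    my @files = @_;
    my @records;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while (my $line = <$fh>) {
            my %data = parse_line($line) or next;   # skip unparsable lines
            push @records, {%data};                 # store a reference to a copy
        }
        close $fh;
    }
    return @records;
}

# Demo helper (made up): write the given lines to a temp file, return its name.
sub write_sample {
    my @lines = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} "$_\n" for @lines;
    close $fh;
    return $name;
}

# Made-up sample line using a documentation IP address.
my $sample = q{192.0.2.10 - - [19/Mar/2009:02:03:46 -0500] "GET /02230909 HTTP/1.1" 302 20 "-" "Mozilla/4.0 (compatible; MSIE 7.0)"};

my @get_logs    = (write_sample($sample), write_sample($sample, 'not a log line'));
my @get_records = load_logs(@get_logs);

printf "%d GET records loaded\n", scalar @get_records;
print "first host: " . $get_records[0]{'%h'} . "\n";
```

Because every record is its own hash reference appended to the array, nothing is overwritten and no extra keys are needed. A second call such as `load_logs(@post_logs)` would fill a separate array for the POST files.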

As a side note, what do you think would be the best way to compare fields between two arrays of hashes? I could come up with some hacky, really expensive way involving lots of nested loops, but I am assuming there is a faster way.
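A common alternative to nested loops is to index one list in a hash keyed on the fields of interest, then make a single pass over the other list and look each record up. The following is a minimal sketch under assumptions: the records are hash references, `make_key` and `find_matches` are hypothetical helper names, the sample records are made up, and the key names `%h` and `%{User-Agent}i` assume a format string without stray quote characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build a single lookup key from the chosen fields of one record.
# "\0" is used as a separator because it will not occur in log data.
sub make_key {
    my ($rec, @fields) = @_;
    return join "\0", map { defined $rec->{$_} ? $rec->{$_} : '' } @fields;
}

# Return a list of [ $get_record, $post_record ] pairs whose chosen
# fields are all equal. Each list is walked only once.
sub find_matches {
    my ($gets, $posts, @fields) = @_;

    # Index the GET records: key => [ records sharing that key ]
    my %index;
    push @{ $index{ make_key($_, @fields) } }, $_ for @$gets;

    my @pairs;
    for my $post (@$posts) {
        my $hits = $index{ make_key($post, @fields) } or next;
        push @pairs, map { [ $_, $post ] } @$hits;
    }
    return @pairs;
}

# Made-up sample records using documentation IP addresses.
my @get_records = (
    { '%h' => '192.0.2.1', '%{User-Agent}i' => 'AgentX' },
    { '%h' => '192.0.2.2', '%{User-Agent}i' => 'AgentY' },
);
my @post_records = (
    { '%h' => '192.0.2.1', '%{User-Agent}i' => 'AgentX' },
    { '%h' => '192.0.2.3', '%{User-Agent}i' => 'AgentX' },
);

my @pairs = find_matches(\@get_records, \@post_records, '%h', '%{User-Agent}i');
printf "%d matching pair(s)\n", scalar @pairs;
```

Matching on the date would probably need the `%t` value reduced to the wanted granularity first (for example, the day only), since full timestamps rarely coincide to the second.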

Please help with a little direction on the best way to accomplish this.
#!/usr/bin/perl -w
use Apache::LogRegex;

my $lr;
my $log_format = '"%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\""';

eval { $lr = Apache::LogRegex->new($log_format) };
die "Unable to parse log line: $@" if ($@);

my $get_logs = ("march-logs/march-bannat.txt", "march-logs/march-logs-web2/march-bannat.txt", "march-logs/march-logs-web3/march-bannat.txt");
my $post_logs = ("march-logs/march-post.txt", "march-logs/march-logs-web2/march-post.txt", "march-logs/march-logs-web3/march-post.txt");

my %data;
my %getRecords;
my $postRecords;

foreach ($get_logs) {
    my @array = &logToHash($_);
}

sub logToHash {
    my $file = $_;
    my %hash;
    my @AoH;
    open LOG, $file or die $!;
    while ( my $line_from_logfile = <LOG> ) {
        eval { %data = $lr->parse($line_from_logfile); };
        if (%data) {
            push @AoH, %data;
        }
    }
    return @AoH;
}
When I print Dumper(\@array) after the subroutine returns, I get a lot of data, but it is printed as shown below, with each key on one line and its value on the next, instead of as $key => $value. Is this correct? Am I pushing the data incorrectly?
'"%h',
'access_log.9.gz:XX.XX.XX.XX',
'%{Referer}i',
'-',
'%t',
'[19/Mar/2009:02:03:46 -0500]',
'%r',
'GET /02230909 HTTP/1.1',
'%{User-Agent}i\\""',
'Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; FunWebProducts; GTB5; .NET CLR 1.0.3705; .NET CLR 1.1.4322; .NET CLR 2.0.50727)',
'%b',
'20',
'%l',
'-',
'%u',
'-',
'%>s',
'302',
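Output of this shape is what Perl produces when a hash is pushed onto an array directly: in list context the hash flattens into alternating keys and values, so each one becomes a separate array element. The small sketch below, using made-up sample values, contrasts that with pushing a reference to a copy of the hash, which is one way the `$key => $value` structure could be kept.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Sortkeys = 1;    # stable key order in the dump

# Made-up sample record with two fields.
my %data = ('%h' => '192.0.2.1', '%>s' => '302');

my @flat;
push @flat, %data;      # hash in list context: keys and values become 4 scalars

my @AoH;
push @AoH, { %data };   # one element: a reference to a copy of the hash

print "flat elements: ", scalar(@flat), "\n";
print "AoH elements:  ", scalar(@AoH),  "\n";
print Dumper(\@AoH);                        # shows key => value pairs
print "host: ", $AoH[0]{'%h'}, "\n";        # fields stay addressable by name
```

Copying with `{ %data }` matters when `%data` is reused for every line, as in the loop above; pushing `\%data` instead would store the same reference repeatedly.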

In reply to Adding data to hashes & comparing by hallikpapa
