Anonymous Monk,
It seems to me that there should be a module on CPAN that already knows how to parse whatever web log format you are working with. If you can't find one, or it doesn't do what you need, the problem is fairly straightforward to fix: add a %seen cache, which will also serve as a counter. This avoids searching through every previous entry to determine whether the current entry is unique.
use strict;
use warnings;
use File::Basename;

my (@count, %seen);
my @mask = qw(/some/path/ some/other/path another/path);

open(my $fh, '<', 'access.log')
    or die "Unable to open 'access.log' for reading: $!";
while (<$fh>) {
    chomp;
    for my $m (@mask) {
        my $regex = "GET.*" . $m . ".*HTTP/1.1\" 200 [0-9].*";
        if (/$regex/) {
            s/.*GET //;      # strip everything up to the requested path
            s/ HTTP.*//;     # strip the protocol, status and size
            my $bn = basename($_);
            # First sighting records the order; %seen keeps the count
            push @count, $bn if ! $seen{$bn}++;
        }
    }
}
close $fh;
print "$_ = $seen{$_}\n" for @count;
This code is untested, but it should work. If you don't care about preserving the order of the entries, you can do away with the array altogether.
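For what it's worth, the hash-only variant might look something like the sketch below. The sample lines and the count_unordered helper are made up for illustration, and I capture the path with a slightly tighter pattern than the one above instead of editing $_ in place, so treat it as a starting point rather than a drop-in replacement.

```perl
use strict;
use warnings;
use File::Basename;

# Hypothetical sample data standing in for lines read from access.log.
my @lines = (
    '1.2.3.4 - - [01/Jan/2007:00:00:01 +0000] "GET /some/path/a.html HTTP/1.1" 200 512',
    '1.2.3.4 - - [01/Jan/2007:00:00:02 +0000] "GET /some/path/b.html HTTP/1.1" 200 128',
    '1.2.3.5 - - [01/Jan/2007:00:00:03 +0000] "GET /some/path/a.html HTTP/1.1" 200 512',
);

# Hypothetical helper: count basenames for one mask, ignoring order.
sub count_unordered {
    my ($mask, @lines) = @_;
    my %seen;
    for my $line (@lines) {
        # Capture the requested path; \Q...\E keeps the mask literal.
        next unless $line =~ m{GET (\S*\Q$mask\E\S*) HTTP/1\.1" 200 [0-9]};
        $seen{ basename($1) }++;
    }
    return %seen;
}

my %seen = count_unordered('/some/path/', @lines);

# With no array to remember first-seen order, sort the keys for stable output.
print "$_ = $seen{$_}\n" for sort keys %seen;
```

The hash alone carries both the uniqueness check and the count; the only thing lost is the order in which the files first appeared.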
Update: As others have suggested, compiling the regexes with qr// outside the loop will improve the performance of this solution even more. If you can combine the regexes using Regexp::Assemble, there will be an additional boost.
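Here is a rough sketch of that idea, under a few assumptions: I join the masks into one alternation by hand with quotemeta rather than using Regexp::Assemble (so it runs with core Perl only), the count_in_order helper and the sample lines are hypothetical, and the pattern is a bit stricter than the original one.

```perl
use strict;
use warnings;
use File::Basename;

my @mask = qw(/some/path/ some/other/path another/path);

# Build one alternation from all masks and compile it once, before the loop.
# quotemeta keeps any regex metacharacters in the paths literal.
my $alt   = join '|', map { quotemeta($_) } @mask;
my $regex = qr{GET (\S*(?:$alt)\S*) HTTP/1\.1" 200 [0-9]};

# Hypothetical helper: returns first-seen order and per-file counts.
sub count_in_order {
    my (@lines) = @_;
    my (@count, %seen);
    for my $line (@lines) {
        next unless $line =~ $regex;    # single precompiled match per line
        my $bn = basename($1);
        push @count, $bn if !$seen{$bn}++;
    }
    return (\@count, \%seen);
}

# Hypothetical sample lines standing in for access.log.
my @sample = (
    '1.2.3.4 - - [01/Jan/2007:00:00:01 +0000] "GET /some/path/a.html HTTP/1.1" 200 512',
    '1.2.3.4 - - [01/Jan/2007:00:00:02 +0000] "GET /x/another/path/b.gif HTTP/1.1" 200 64',
    '1.2.3.5 - - [01/Jan/2007:00:00:03 +0000] "GET /some/path/a.html HTTP/1.1" 200 512',
    '1.2.3.5 - - [01/Jan/2007:00:00:04 +0000] "GET /some/path/c.html HTTP/1.1" 404 0',
);

my ($count, $seen) = count_in_order(@sample);
print "$_ = $seen->{$_}\n" for @$count;
```

Each line is now tested against one compiled pattern instead of rebuilding a pattern for every mask on every line, which is where most of the saving should come from.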