Re: log parsing very slow

Update: Corrected the RE.

First thing to note: Use a hash for counting.

foreach (@count) {
    if ($_->[0] eq $bn) {
        $_->[1]++;
        $found = 1;
        last;
    }
}

if ($found == 0) {
    push @count, [ $bn, 1 ];
}

becomes

++$count{$bn};
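
To see the hash version in isolation, here is a minimal sketch. The sub name count_names and the sample list are made up for illustration; the point is that each increment is a single hash lookup instead of a scan over @count.

```perl
use strict;
use warnings;

# Hypothetical helper: tally how often each name occurs.
# A hash lookup replaces the linear search through a list of pairs.
sub count_names {
    my @names = @_;
    my %count;
    ++$count{$_} for @names;    # a missing key springs into existence at 1
    return \%count;
}

# Made-up sample data, standing in for basenames pulled from a log.
my $count = count_names(qw(index.html logo.png index.html index.html));
print "$_ = $count->{$_}\n" for sort keys %$count;
```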

Next thing: Precompile your RE and do it for all masks:

my $regex = join '|', @mask;
# Now you have a list of alternatives
$regex = qr<GET (.*\b(?:$regex)\b.*) HTTP/1.1" 200 [0-9].*>;
# now the regex is precompiled

(Untested!)
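
A small self-contained sketch of the same idea, in case it helps to try it out. The sub name build_regex, the masks and the log lines are invented for the example, and quotemeta is my addition (an assumption that the masks are literal paths rather than patterns).

```perl
use strict;
use warnings;

# Hypothetical helper: build one precompiled regex from a list of path masks.
# quotemeta is an addition here, so that '.' or '+' in a mask is matched literally.
sub build_regex {
    my @mask = @_;
    my $alt = join '|', map { quotemeta } @mask;
    return qr<GET (.*\b(?:$alt)\b.*) HTTP/1\.1" 200 [0-9]>;
}

my $regex = build_regex('some/path', 'another/path');

# Made-up log lines in common log format.
my @lines = (
    '1.2.3.4 - - [01/Jan/2004:00:00:00 +0000] "GET /some/path/a.html HTTP/1.1" 200 512',
    '1.2.3.4 - - [01/Jan/2004:00:00:01 +0000] "GET /elsewhere/b.html HTTP/1.1" 200 99',
);

# Only lines whose request path contains one of the masks are printed.
for my $line (@lines) {
    print "$1\n" if $line =~ $regex;
}
```

The regex is compiled once, outside the loop, so each line costs one match no matter how many masks there are.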

Next thing: Why not replace while searching? So

if (/$regex/) {
    s/.*GET //;
    s/ HTTP.*//;
      :
      :

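is reduced to the following (s/// only replaces the text that matched, so to drop everything before GET as well, the regex has to match it too, e.g. with a leading ^.*):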

if (s/$regex/$1/) {
      :
      :
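
Here is a rough sketch of the match-and-substitute step on its own. The sub name extract_path, the fixed mask some/path and the sample line are assumptions for illustration, and the pattern is anchored with ^.* so that the whole line is part of the match.

```perl
use strict;
use warnings;

# Hypothetical helper: match and trim a log line in a single s/// step.
# Returns the requested path, or undef if the line does not match.
sub extract_path {
    my ($line, $regex) = @_;
    return $line =~ s/$regex/$1/ ? $line : undef;
}

# Assumption: the pattern starts with ^.* and ends with .*, so the whole line
# is inside the match and only the captured path is left behind.
my $regex = qr<^.*GET (.*\bsome/path\b.*) HTTP/1\.1" 200 [0-9].*>;

# Made-up log line; a line read from a file would still carry its newline.
my $line = '1.2.3.4 - - [01/Jan/2004:00:00:00 +0000] "GET /some/path/a.html HTTP/1.1" 200 512';
my $path = extract_path($line, $regex);
print defined $path ? "$path\n" : "no match\n";
```

If you only need the captured part and not the modified line, a plain match with $1 does the same job without rewriting the string.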

Putting it all together and eliminating the unnecessary replace and chomp gives you this script, which should (UNTESTED) give the same result as yours, except for the order of the filenames, which will be sorted:

use strict;
use File::Basename;

my %count;
my @mask = (
        '/some/path/',
        'some/other/path',
        'another/path',
);

my $regex = join '|', @mask;
# Now you have a list of alternatives
$regex = qr<GET (.*\b(?:$regex)\b.*) HTTP/1.1" 200 [0-9].*>;
# now the regex is precompiled
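# N.B. I added \b to make the filename only match on word boundaries.
# So some/path won't match anothersome/path
# but it will match another/some/path
# (\b needs a word character on one side, so a mask that begins or ends
# with '/' only matches when a word character sits right next to that slash)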

open my $fh, '<', 'access.log' or die "Cannot open access.log: $!";

while (<$fh>) {
        if (/$regex/) {
                ++$count{basename($1)};
        }
}

foreach (sort keys %count) {
        print $_," = ",$count{$_},"\n";
}

$\=~s;s*.*;q^|D9JYJ^^qq^\//\\\///^;ex;print
