Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
    use strict;
    use warnings;
    use File::Basename;
    use File::stat;
    use File::Find;
    use Cwd;

    my ($root) = getcwd =~ /(.*)/;
    my $total;

    find(
        {
            untaint_pattern => '.*',
            no_chdir        => 1,
            wanted          => sub {
                return unless /MyFoo.*\z/;
                my $v_snap_file = $File::Find::name;
                my $basefile    = basename($v_snap_file);

                # I know this is evil, it's a hack.
                my $count = `/bin/grep $basefile /var/log/squid/access.log | /usr/bin/wc -l`;
                $count =~ s/^\s+//g;

                my $v_sb       = stat("$v_snap_file");
                my $v_filesize = $v_sb->size;
                my $v_bprecise = sprintf "%.0f", ($v_filesize);
                my $v_bsize    = insert_commas($v_bprecise);
                my $v_kprecise = sprintf "%.0f", ($v_filesize/1024);
                my $v_ksize    = insert_commas($v_kprecise);
                my $v_filedate = scalar localtime $v_sb->mtime;
                my $basename_v = basename($v_snap_file);

                print "File Name..: $basename_v\n";
                print "File Size..: $v_bsize bytes ($v_ksize kb)\n";
                print "Downloads..: ", insert_commas($count);

                my $tbytes = $v_filesize * $count;
                print "Total bytes: ", insert_commas($tbytes), "\n\n";
                $total += $tbytes;
            }
        },
        $root
    );

    print "\n", "-"x40, "\n";
    print "Final total bytes: ", insert_commas($total), "\n\n";

    sub insert_commas {
        my $text = reverse $_[0];
        $text =~ s/(\d{3})(?=\d)(?!\d*\.)/$1,/g;
        return scalar reverse $text;
    }

The Squid log entries look like this (Yes, these are real entries):

    wdcsun28.usdoj.gov - - [07/Aug/2003:04:58:15 -0700] "GET http://dl.domain.org/MyFoo-file.zip HTTP/1.0" 200 1607158 TCP_MISS:DIRECT
    wdcsun28.usdoj.gov - - [07/Aug/2003:05:03:33 -0700] "GET http://dl.domain.org/MyFoo-file.zip HTTP/1.0" 200 8224380 TCP_MISS:DIRECT

The numeric value right before the "TCP_MISS:DIRECT" is the file size. Notice that this generated two hits for what basically is one download. The real final file size for 'file.zip' is 8224380 bytes; just a little over 8 megs.
When I count these hits in the logs, and generate the stats for the number of bytes downloaded, I'd like to ignore the ones that are not "full" file downloads, by looking at that file size.
Any ideas how I can do this? The code above works, it just counts ALL hits in the logs, not "completed" hits in the logs. Did that make sense?
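One possible approach, sketched here under assumptions rather than offered as a definitive answer: read the log in Perl instead of shelling out to grep, pull the byte count out of each matching entry, and count the entry only when that number is at least the file's real size. The helper name `count_full_downloads` is hypothetical, and the regex assumes every entry has the same layout as the two sample lines above (a quoted `GET` request followed by the status code and the byte count).

```perl
use strict;
use warnings;

# Hypothetical helper: count log entries for $basefile whose logged byte
# count is at least $filesize, i.e. entries that look like full downloads.
# Assumes the log layout of the sample entries:
#   host - - [date] "GET <url> HTTP/x.y" <status> <bytes> <squid result>
sub count_full_downloads {
    my ($log_fh, $basefile, $filesize) = @_;
    my $count = 0;
    while (my $line = <$log_fh>) {
        # Capture the requested URL, the HTTP status and the byte count.
        next unless $line =~ m{"GET \s+ (\S+) \s+ HTTP/[\d.]+" \s+ (\d+) \s+ (\d+)}x;
        my ($url, $status, $bytes) = ($1, $2, $3);
        next unless $url =~ m{/\Q$basefile\E\z};   # entry is for this file
        next unless $status == 200;                # successful response only
        next unless $bytes >= $filesize;           # skip partial transfers
        $count++;
    }
    return $count;
}

# Demonstration using the two sample entries from the question.
my $sample = <<'END';
wdcsun28.usdoj.gov - - [07/Aug/2003:04:58:15 -0700] "GET http://dl.domain.org/MyFoo-file.zip HTTP/1.0" 200 1607158 TCP_MISS:DIRECT
wdcsun28.usdoj.gov - - [07/Aug/2003:05:03:33 -0700] "GET http://dl.domain.org/MyFoo-file.zip HTTP/1.0" 200 8224380 TCP_MISS:DIRECT
END

open my $fh, '<', \$sample or die "Cannot open in-memory log: $!";
my $n = count_full_downloads($fh, 'MyFoo-file.zip', 8224380);
close $fh;
print "Completed downloads: $n\n";   # prints "Completed downloads: 1"
```

In the script above, the backtick `grep | wc` line could be replaced by opening `/var/log/squid/access.log` and passing the handle along with `$basefile` and `$v_sb->size`. The log would need to be reopened (or rewound with `seek`) for each file found. The `>=` comparison is a design choice: it tolerates a logged size slightly larger than the file, whereas `==` would count only exact matches.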
Replies are listed 'Best First'.

Re: File download statistics parsing
by BrowserUk (Patriarch) on Aug 07, 2003 at 12:59 UTC

Re: File download statistics parsing
by dda (Friar) on Aug 07, 2003 at 12:47 UTC
by Anonymous Monk on Aug 07, 2003 at 13:03 UTC
by esh (Pilgrim) on Aug 07, 2003 at 20:57 UTC

Re: File download statistics parsing
by l2kashe (Deacon) on Aug 07, 2003 at 14:48 UTC

Re: File download statistics parsing
by bean (Monk) on Aug 07, 2003 at 22:38 UTC
by bean (Monk) on Aug 07, 2003 at 22:58 UTC