This is a proxy server that records all your web browsing activity. It's built on NetServer::Generic, which lets you focus on the core of your server app and leads to compact servers. Performance-wise, it may add slightly to your browser's response time, but it's lightweight enough to go unnoticed on a fast machine. It also uses Proc::Daemon to do the things a decent daemon should, such as detaching from its controlling terminal.

Usage: First, configure it by changing the variables at the top of the script to suit your browsing habits and bandwidth. $pageview_range (in seconds) should be a little longer than the average time it takes your browser to issue requests for all the component files of a page view. $per_page_time (in minutes) is the average time you spend on a page; the program uses it to give you a simple approximation of the time you spend on the web. $listen_port is the port you want your proxy to listen on, and $logfile should be the path to the logfile where your web browsing activity is to be recorded.
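To make the role of $pageview_range concrete, here is a minimal sketch of the grouping idea, pulled out into a standalone helper. The name count_page_views is hypothetical and does not appear in the script; the script does the same comparison inline, per server, while reading the logfile.

```perl
use strict;
use warnings;

# Hypothetical helper: count page views for one server from its request
# timestamps (epoch seconds, in ascending order). A request starts a new
# page view only if it arrives more than $range seconds after the request
# that started the previous page view.
sub count_page_views {
    my ($range, @times) = @_;
    my ($views, $last) = (0, undef);
    for my $t (@times) {
        if (!defined $last || $t - $last > $range) {
            $views++;
            $last = $t;
        }
    }
    return $views;
}

# The requests at 0, 5 and 10 seconds collapse into one page view;
# the request at 100 seconds starts a second one.
print count_page_views(20, 0, 5, 10, 100), "\n";
```

If your pages pull in many images over a slow link, a larger range keeps them from being counted as separate page views; too large a range will merge real page views together.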

If you run this proxy on your own machine, configure your browser to use a proxy on localhost at the port you set in $listen_port.
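Once the browser is pointed at the proxy, it sends the full URL in the request line instead of just the path, and that is what the script's pattern match relies on. The sketch below wraps the same regex in a helper so you can see what it extracts. The name parse_proxy_request and the example.com URLs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the same pattern the proxy script uses on
# the first line of each request.
sub parse_proxy_request {
    my ($line) = @_;
    return unless $line =~ m[(\w+)\s+http://([^/:]+)(:(\d+))?(\S*)\s+(\S+)];
    return {
        method  => $1,
        host    => $2,
        port    => $4 || 80,   # default to port 80 when the URL has none
        path    => $5,
        version => $6,
    };
}

my $req = parse_proxy_request("GET http://example.com:8081/index.html HTTP/1.0\n");
print join(' ', @{$req}{qw(method host port path version)}), "\n";

# A request line without an absolute http:// URL does not match, which is
# the case the script answers with "400 Bad Request".
print "not a proxy request\n"
    unless parse_proxy_request("GET /index.html HTTP/1.0\n");
```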

After doing this, just continue your happy browsing. When you're curious about how much time you spend on the web (and on any particular site you visit), go to the URL http://stats and you'll get a nice report from the proxy.
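The time figure in the report is only a rough estimate: page views in the last month, divided by 30 days, times $per_page_time. A minimal sketch of that arithmetic follows; the helper name minutes_per_day and the figure of 900 page views are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the script's estimate:
# (page views in the last month / 30 days) * minutes spent per page.
sub minutes_per_day {
    my ($month_views, $per_page_time) = @_;
    return ($month_views / 30) * $per_page_time;
}

# 900 page views in a month at 2 minutes per page.
printf "%.1f minutes per day\n", minutes_per_day(900, 2);
```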

It works under Linux, and may also run on NT. This is my first cut at it and I haven't tested it exhaustively. Please provide any comments you may have on functionality or style.

Warning: This program could be a very nasty thing to use on coworkers, but may be of great help in monitoring a child's use of the web.

Update: I found out that my program will fill the process table with zombie processes. Looks like a bug in NetServer::Generic, since the ones I fork are properly ignored by the parent and do not remain.

Update: Wow.. It turns out that NetServer::Generic does indeed have a bug. If you want to fix it yourself, go to the source (file Generic.pm) and replace every occurrence of $SIG{CHLD} = &reap_child(); with $SIG{CHLD} = \&reap_child;. I'm using v1.02, which is the most recent. I'll submit a patch to the author.
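For anyone wondering why those two lines behave so differently, here is a small sketch you can run on its own. The reap_child below is a stand-in I wrote for the demo, not the one in Generic.pm: &reap_child() calls the sub right away and yields its return value, while \&reap_child yields a code reference that can be called later, which is what a signal handler slot needs.

```perl
use strict;
use warnings;

my $calls = 0;

# Stand-in for the real reaper, just so we can see when it runs.
sub reap_child { $calls++; return 'done' }

my $wrong = &reap_child();   # calls the sub now; $wrong is its return value
my $right = \&reap_child;    # takes a reference; nothing is called yet

print "after both assignments the sub ran $calls time(s)\n";
print "wrong holds: $wrong\n";
print "right holds a ", ref($right), " reference\n";

$right->();                  # invoking the reference runs the sub
print "now the sub has run $calls time(s)\n";
```

With the original line, the handler slot gets whatever the sub happened to return instead of the sub itself, so nothing is set up to reap children when they exit.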

#!/usr/bin/perl -w
use strict;
use NetServer::Generic;
use Proc::Daemon;
use IO::Socket::INET;

my $listen_port    = 8080;
my $logfile        = '/tmp/proxy_log';
my $pageview_range = 20;   # seconds
my $per_page_time  = 2;    # minutes

# Per-connection callback: reads the browser's request from STDIN and
# writes the reply to STDOUT.
my $server_cb = sub {
    my ($s) = shift;
    my $line1 = <STDIN>;
    unless (defined $line1
        && $line1 =~ m[(\w+)\s+http://([^/:]+)(:(\d+))?(\S*)\s+(\S+)]) {
        print STDOUT "HTTP/1.0 400 Bad Request\nConnection: close\n\n";
        print STDOUT "HTTP/1.0 400 Bad Request\n";   # response body
        return;
    }
    my ($method, $serv, $port, $path, $version) = ($1, $2, $4, $5, $6);

    if ($serv !~ /stats/) {
        # Ordinary request: relay it to the real server.
        my $sock = IO::Socket::INET->new(
            PeerAddr => $serv,
            PeerPort => $port || 80,
            Proto    => 'tcp',
        );
        print $sock "$method $path $version\n";
        print $sock "Connection: close\n";
        $SIG{CHLD} = 'IGNORE';
        if (my $pid = fork) {
            # Parent: pass the rest of the browser's request upstream.
            while (<STDIN>) {
                print $sock $_;
            }
        }
        else {
            # Child: pass the server's response back to the browser.
            while (<$sock>) {
                print STDOUT $_;
            }
        }
    }
    else {
        # Request for http://stats: print the report.
        my $stats = getStats();
        print STDOUT "HTTP/1.1 200 OK\nContent-type: text/plain\n";
        print STDOUT "Connection: close\n\n";
        print STDOUT "Your Browsing Stats!\n\n";
        print STDOUT "$stats->{DAY} page views in the last day\n";
        print STDOUT "$stats->{WEEK} page views in the last week\n";
        print STDOUT "$stats->{MONTH} page views in the last month\n";
        print STDOUT "$stats->{YEAR} page views in the last year\n\n";
        my $avg_time = ($stats->{MONTH} / 30) * $per_page_time;
        print STDOUT "At $per_page_time minutes per page that's $avg_time minutes per day in the last month.\n\n";
        print STDOUT "Your favorite sites:\n\n";
        # Servers sorted by total page views, most visited first.
        foreach (map  { $_->[0] }
                 sort { $b->[1] <=> $a->[1] }
                 map  { [$_, $stats->{BY_SERVER}{$_}{TOTAL}] }
                 (keys %{$stats->{BY_SERVER}})) {
            print STDOUT "$_\n";
            print STDOUT "--------------------------------------------\n";
            print STDOUT $stats->{BY_SERVER}{$_}{DAY}   || 0, " page views in the last day\n";
            print STDOUT $stats->{BY_SERVER}{$_}{WEEK}  || 0, " page views in the last week\n";
            print STDOUT $stats->{BY_SERVER}{$_}{MONTH} || 0, " page views in the last month\n";
            print STDOUT $stats->{BY_SERVER}{$_}{YEAR}  || 0, " page views in the last year\n\n";
        }
    }

    # Record the request: server name and timestamp.
    open LOG, ">>$logfile" or die "could not open $logfile: $!";
    print LOG $serv, ' ', time, "\n";
    close LOG;
};

my ($foo) = NetServer::Generic->new;
$foo->port($listen_port);
$foo->callback($server_cb);
$foo->mode('forking');
print "Starting server\n";
Proc::Daemon::Init();
$foo->run();

# Reads the logfile and returns a hash ref of page view counts, overall
# and per server, for the last day, week, month and year.
sub getStats {
    my $day_ago   = time - 60 * 60 * 24;
    my $week_ago  = time - 60 * 60 * 24 * 7;
    my $month_ago = time - 60 * 60 * 24 * 30;    # was 24 * 7 * 30
    my $year_ago  = time - 60 * 60 * 24 * 365;   # was 24 * 7 * 30 * 12
    my ($serv, $time);
    my %hits = (DAY => 0, WEEK => 0, MONTH => 0, YEAR => 0);
    open LOG, "$logfile" or die "could not open $logfile: $!";
    while (<LOG>) {
        ($serv, $time) = split;
        # Requests within $pageview_range seconds of the one that started
        # the current page view belong to that same page view.
        if ($time - ($hits{BY_SERVER}{$serv}{LAST} || 0) > $pageview_range) {
            $hits{BY_SERVER}{$serv}{HITS}{$time} = 1;
            $hits{BY_SERVER}{$serv}{LAST} = $time;
            $hits{BY_SERVER}{$serv}{TOTAL}++;
            if ($day_ago < $time) {
                $hits{BY_SERVER}{$serv}{DAY}++;
                $hits{BY_SERVER}{$serv}{WEEK}++;
                $hits{BY_SERVER}{$serv}{MONTH}++;
                $hits{BY_SERVER}{$serv}{YEAR}++;
                $hits{DAY}++;
                $hits{WEEK}++;
                $hits{MONTH}++;
                $hits{YEAR}++;
            }
            elsif ($week_ago < $time) {
                $hits{BY_SERVER}{$serv}{WEEK}++;
                $hits{BY_SERVER}{$serv}{MONTH}++;
                $hits{BY_SERVER}{$serv}{YEAR}++;
                $hits{WEEK}++;
                $hits{MONTH}++;
                $hits{YEAR}++;
            }
            elsif ($month_ago < $time) {
                $hits{BY_SERVER}{$serv}{MONTH}++;
                $hits{BY_SERVER}{$serv}{YEAR}++;
                $hits{MONTH}++;
                $hits{YEAR}++;
            }
            elsif ($year_ago < $time) {
                $hits{BY_SERVER}{$serv}{YEAR}++;
                $hits{YEAR}++;
            }
        }
    }
    close LOG;
    \%hits;
}

In reply to How much time do you spend at Perlmonks? (personal web proxy) by gregorovius
