Well, my hope is that you would cron this script and run it every hour for, say, a month before starting to do some analysis. While the number of users on the site doesn't necessarily provide a complete model of site load, over a month my guess is that there is a strong correlation between the number of users recorded and the time it takes to load. When creating your graphs etc. at the end of the month you could eliminate outliers. If you notice certain times where the load time is always high (e.g. a time when the database is being backed up), you could eliminate those times too. Over a long enough period of time you should come out with some good data.

As far as personal nodelets go, since I am loading the default frontpage, which will (almost) always have the same nodelet configuration, this and other personalization issues aren't really relevant. Even if I were loading my personal frontpage, as long as I didn't make significant changes to the configuration, this program should still provide you with good data.

One thing that I didn't account for is the size of the page loaded. Obviously the larger the page, the longer it will take to load regardless of users, so a way to compensate for that (and for users who can't saturate the PerlMonks bandwidth) is to record the size of $url each time you calculate the load time and, when you are doing analysis, throw out extremely large frontpages etc.
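For the hourly run, a crontab entry could look something like this (the script path and credentials are just placeholders, adjust for your setup):

<code>
# run the load monitor at the top of every hour
0 * * * * /usr/bin/perl /home/you/webload.pl -u dbuser -p dbpass
</code>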
Here is a new MySQL table:
create table web_load (
    date      datetime     not null,
    url       text         not null,
    load_secs float unsigned not null,
    size      int unsigned not null,
    users     int unsigned not null
);
and new code which takes size into account:
#!/usr/bin/perl -w
use strict;
use DBI;
use LWP::Simple;
use HTTP::Size;
use Time::HiRes qw(gettimeofday tv_interval);
use Getopt::Std;

#### command-line config
my (%options);
getopts("w:u:p:h", \%options);
if ($options{h}) {
    print <<'eof';
-w webpage:  Webpage to fetch
-u username: Username for mysql
-p password: Password for mysql
-h:          This help file
eof
    exit;
}

#### config
my ($db_insert_time) = qq{INSERT web_load (date, url, load_secs, size, users) VALUES(now(),?,?,?,?)}; # sql insert
my ($url)          = $options{w} || 'http://www.perlmonks.org/index.pl?node_id=131'; # default webpage (perlmonks frontpage)
my ($db_user_name) = $options{u} || ''; # default mysql username
my ($db_password)  = $options{p} || ''; # default mysql password
my ($db_database)  = 'DBI:mysql:website'; # default mysql database

#### connect to db
my ($DBH) = DBI->connect($db_database, $db_user_name, $db_password, { RaiseError => 1 });

#### record start time, get frontpage, and calculate elapsed time
my ($start_secs) = [gettimeofday];
my ($content)    = get($url);
die("Couldn't GET $url\n") unless defined $content;
my ($load_secs) = tv_interval($start_secs);

#### extract users from $content and do some error checking (only for perlmonks)
my ($users) = $content =~ /\((\d+?)\)<br \/><a HREF=/;
die("Couldn't extract users from $url\n") unless defined $users;

#### calculate size of $url (if someone knows a better way to do this please tell me)
my ($size) = HTTP::Size::get_size($url);
die("Couldn't get size of $url\n") unless defined $size;

#### insert url, load_secs, size, and users into database
my ($STH) = $DBH->prepare($db_insert_time);
$STH->execute($url, $load_secs, $size, $users);

#### database finish
$STH->finish();
$DBH->disconnect();
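At the end of the month, the analysis described above could start with a query along these lines against the web_load table (the size cutoff here is just an assumption; pick one based on the distribution of your own recorded sizes):

<code>
-- hourly average load time and user count,
-- throwing out oversized frontpages (100000 bytes is a guess)
SELECT HOUR(date) AS hr,
       AVG(load_secs) AS avg_load,
       AVG(users)     AS avg_users
FROM web_load
WHERE size < 100000
GROUP BY HOUR(date);
</code>

A consistently high avg_load for one particular hr (relative to its avg_users) would be a candidate for the "always slow at backup time" rows worth excluding before graphing load against users.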