| Category: | PerlMonks Related Scripts |
| Author/Contact Info | crashtest |
| Description: | Ever wondered if you've seen every single quip that displays on the top of the pages on this site? I was curious, so I put together a little screen-scraper to check. Hey, it was a slow Sunday afternoon!
No command-line arguments for this script, but there are three variables at the top of the code (clearly marked) that provide some crude configuration. $iterations determines how often perlmonks.org is queried, $nice sets how long to pause between HTTP requests, while $status_print_interval configures how often the script pipes up with a quick status report to show it's still alive.
After 100 iterations, I think I've "collected 'em all", but I look forward to running this script every couple of months to see what other witticisms those [id://pmdev]s come up with:
********** RESULTS ********** |
#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;
#####################################################################
### Set how many times to run, how long (in seconds) to wait between
### page hits, and how often to print an interim status report.
my ($iterations, $nice, $status_print_interval) = (100, 5, 5);
#####################################################################
my %quips;
$SIG{INT} = sub{ print_results(\%quips); exit(0) };
my $loopcount = 0;
print "Beginning PerlMonks Quip Gatherer...\n";
while($iterations > 0){
my $content = get('http://www.perlmonks.org');
die "Failed to load content!\n" unless defined($content);
extract_quip(\%quips, \$content);
$iterations--;
$loopcount++;
print "Found ", scalar keys %quips,
" quip(s) so far, $iterations iteration(s) left...\n"
if ($loopcount % $status_print_interval == 0);
sleep($nice) if ($iterations);
}
print_results(\%quips);
#####################################################################
### SUBS ###
#####################################################################
sub extract_quip{
my ($quips, $content) = @_;
if ($$content =~ m!<td class="monkquip"[^>]+>(.*?)</td>!s){
my $data = $1;
$data =~ s!\<[^>]*>!!sg;
$data =~ s!\s{2,}!!sg;
$quips->{$data}++;
}
}
sub print_results{
my $quips = shift;
print "\n********** RESULTS **********\n";
while (my ($key, $val) = each(%$quips)){
print "$val time(s): $key\n";
}
}
|
|
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re: PerlMonks Quips Gatherer
by davido (Cardinal) on May 09, 2005 at 06:48 UTC | |
|
Re: PerlMonks Quips Gatherer
by cog (Parson) on May 09, 2005 at 09:17 UTC | |
by crashtest (Curate) on May 09, 2005 at 17:03 UTC | |
by cog (Parson) on May 09, 2005 at 17:37 UTC | |
by mrborisguy (Hermit) on May 11, 2005 at 03:19 UTC | |
by tye (Sage) on May 09, 2005 at 22:54 UTC | |
by crashtest (Curate) on May 10, 2005 at 05:35 UTC | |
|
Re: PerlMonks Quips Gatherer
by ysth (Canon) on May 08, 2005 at 23:53 UTC |