in reply to Sneeky Snake
Works fine, although it's definitely in the "one shot" category: any major change to the web page format will break this program. HTML::TableExtract is a better overall solution, but throwing a hack like this together only takes a few minutes. It's an (easy) example of the general idea of loading a web page and sucking out the bits that you're interested in.

#!/usr/bin/perl -w
use strict;
use LWP::Simple;

my $page = get("http://setiathome.ssl.berkeley.edu/stats/country_7.html");
die "Couldn't fetch the page\n" unless defined $page;

my @lines = split /<tr>/, $page;
for (@lines) {
    # Take out the links (for the lines that have 'em)
    s/<a.*?>(.*?)<\/a>/$1/g;
    # Take out the silly &nbsp;
    s/&nbsp;//g;
    # Match the 2 parts you want and print them (lines that don't match are skipped)
    print "$1 : $2\n" if m/<td>(.*?)<\/td>.*?(\d+)/is;
}
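If you want to play with the idea without hitting the network, here is a minimal sketch of the same split-and-match approach run against a made-up HTML fragment. The sample table, its names and numbers, and the helper name extract_rows are all invented for illustration; the real page's markup may well differ.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample standing in for the downloaded page; the real
# markup and figures may look different.
my $sample = <<'HTML';
<table>
<tr><th>Country</th><th>Results</th></tr>
<tr><td><a href="us.html">United States</a></td><td>&nbsp;12345</td></tr>
<tr><td>Canada</td><td>&nbsp;678</td></tr>
</table>
HTML

# Hypothetical helper: returns a list of [name, number] pairs.
sub extract_rows {
    my ($html) = @_;
    my @rows;
    for my $chunk (split /<tr>/i, $html) {
        # Strip link tags but keep the link text
        $chunk =~ s/<a\b[^>]*>(.*?)<\/a>/$1/gis;
        # Drop non-breaking-space entities
        $chunk =~ s/&nbsp;//g;
        # First cell is the name; the first run of digits after it is the count
        next unless $chunk =~ m/<td>(.*?)<\/td>.*?(\d+)/is;
        push @rows, [$1, $2];
    }
    return @rows;
}

print "$_->[0] : $_->[1]\n" for extract_rows($sample);
```

Chunks without a <td> cell (the header row, the bit before the first <tr>) simply fail the match and get skipped, so nothing stale is printed. It is still a one-shot hack: it assumes one row per <tr> and a bare <td> with no attributes.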
Gary Blackburn
Trained Killer