in reply to Grabbing a hundred pages

LWP::Simple should be your first choice, as noted above. However, it is somewhat limiting, and since you are fetching more than 100 pages you may appreciate the ability to set timeouts and the like.

LWP::UserAgent is only a little more complicated than LWP::Simple:

use LWP::UserAgent;
use HTTP::Request::Common qw(GET);

# Create a user agent, send a GET request, and keep the response body
my $ua = LWP::UserAgent->new;
my $response = $ua->request(GET $rdf->{rdf_url});
my $html = $response->content;

Then parse the HTML as described above.
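
Since timeouts were the reason for reaching for LWP::UserAgent, here is a minimal sketch of setting one and checking for failure before using the content. The helper name fetch_page, the 30-second timeout, and the agent string are my own assumptions for illustration, not anything from the thread.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch one URL and return its body,
# or undef (with a warning) if the request failed or timed out.
sub fetch_page {
    my ($ua, $url) = @_;
    my $response = $ua->get($url);
    unless ($response->is_success) {
        warn "Failed to fetch $url: ", $response->status_line, "\n";
        return;
    }
    return $response->decoded_content;
}

# 30 seconds is an arbitrary choice; tune it for the sites you are hitting.
my $ua = LWP::UserAgent->new(timeout => 30);
$ua->agent('page-grabber/0.1');    # hypothetical agent string

# URLs are taken from the command line.
for my $url (@ARGV) {
    my $html = fetch_page($ua, $url);
    next unless defined $html;
    printf "%s: %d characters\n", $url, length $html;
}
```

With a hundred or more pages, one slow server then costs you at most the timeout rather than stalling the whole run.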

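If you do wind up using LWP::Simple, check out the

if (is_success(mirror($URL, $URL_MIRROR))) { ...
construct, so that you only process HTML documents you haven't previously processed. mirror() returns 304 (Not Modified) when the local copy in $URL_MIRROR is already up to date, and is_success() is true only for 2xx codes.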
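
A rough sketch of that construct in a loop, assuming URL/file pairs are given on the command line. The helper name fetched_new_copy and the way the pairs are passed are my own inventions for illustration.

```perl
use strict;
use warnings;
use LWP::Simple qw(mirror is_success);

# Hypothetical helper: mirror $url into $file and return true only when
# a fresh copy was written. A 304 (Not Modified) or an error gives false.
sub fetched_new_copy {
    my ($url, $file) = @_;
    return is_success(mirror($url, $file));
}

# Usage (assumed): perl mirror_pages.pl URL FILE [URL FILE ...]
my %pages = @ARGV;

for my $url (sort keys %pages) {
    my $file = $pages{$url};
    if (fetched_new_copy($url, $file)) {
        print "new or changed, processing $file\n";
        # parse $file here
    }
    else {
        print "skipping $url (unchanged or failed)\n";
    }
}
```

Note that a false result lumps "unchanged" together with real failures; if you need to tell them apart, keep the return value of mirror() and compare it with the status codes yourself.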