mkurtis has asked for the wisdom of the Perl Monks concerning the following question:
Yes, I know it's missing the logic for URLs and a hash to store them in, but I need a loop to tie the entire thing together. I used to open a file and enclose the whole thing in a while loop, but that was back when I mistakenly thought I could read from and append to the same file. Here are the past crawler attempts: Useless use of substr in void context

    #!/usr/bin/perl -w
    use LWP::Simple;
    use HTML::SimpleLinkExtor;
    use Data::Dumper;
    use LWP::RobotUA;
    use HTTP::Response;

    $_ = "http://www.frozenhosting.com";
    my $ua = LWP::RobotUA->new("theusefulbot", "akurtis3 at yahoo.com");
    $ua->delay(10/60);
    my $content = $ua->get($_);
    my $extor = HTML::SimpleLinkExtor->new();
    $extor->parse($content);
    my @links = $extor->a;
    print "start";
    foreach $links (@links) {
    }
    print $content;
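For reference, here is a minimal sketch of the loop-plus-hash approach the question describes, assuming a @queue array for URLs still to visit and a %seen hash for URLs already crawled; the start URL and bot identity are copied from the snippet above, and the scheme filter is an added assumption. Note that LWP::RobotUA's get returns an HTTP::Response object, so the HTML has to be pulled out with ->content before parsing:

    #!/usr/bin/perl -w
    use strict;
    use LWP::RobotUA;
    use HTML::SimpleLinkExtor;
    use URI;

    my $ua = LWP::RobotUA->new("theusefulbot", "akurtis3 at yahoo.com");
    $ua->delay(10/60);

    my %seen;                                      # URLs already fetched
    my @queue = ("http://www.frozenhosting.com");  # URLs still to visit

    while (my $url = shift @queue) {
        next if $seen{$url}++;                     # skip pages we've already crawled

        my $response = $ua->get($url);
        next unless $response->is_success;         # skip fetch errors, robots.txt denials, etc.

        print "crawled $url\n";

        my $extor = HTML::SimpleLinkExtor->new();
        $extor->parse($response->content);         # parse the HTML body, not the response object

        for my $link ($extor->a) {
            my $abs = URI->new_abs($link, $url);   # resolve relative links against the page URL
            next unless $abs->scheme && $abs->scheme =~ /^https?$/;
            push @queue, $abs->as_string unless $seen{$abs->as_string};
        }
    }

The %seen hash does double duty: it stops the same page from being fetched twice and keeps duplicate links from piling up in @queue, which is roughly what reading from and appending to the same file was trying to approximate.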
Thanks,
Replies are listed 'Best First'.
Re: Writing a Web Crawler
by kvale (Monsignor) on Feb 25, 2004 at 05:23 UTC
Re: Writing a Web Crawler
by perrin (Chancellor) on Feb 25, 2004 at 21:35 UTC
Re: Writing a Web Crawler
by petdance (Parson) on Feb 26, 2004 at 02:28 UTC
by mkurtis (Scribe) on Feb 26, 2004 at 04:45 UTC
by petdance (Parson) on Feb 27, 2004 at 03:05 UTC