use strict;
use warnings;
use LWP::UserAgent;
use WWW::Robot;

print "Please input the URL of the site to be searched\n";
my $url_name = <STDIN>;   # The user inputs the URL to be searched
chomp $url_name;          # strip the trailing newline, or the URL won't parse

# Create an instance of the web crawler
my $web_crawler = WWW::Robot->new(
    NAME      => 'My WebCrawler',
    VERSION   => '1.000',
    USERAGENT => LWP::UserAgent->new,
    EMAIL     => 'aca03lh@sheffield.ac.uk',
);

# Below the hooks of the web crawler are set. Every handler referenced
# here must actually be defined, and the *-test hooks must return true
# or the crawler quietly visits nothing.
$web_crawler->addHook('invoke-on-all-url',  \&invoke_test);
$web_crawler->addHook('follow-url-test',    \&follow_test);
$web_crawler->addHook('invoke-on-contents', \&invoke_contents); # to be able to get contents from webpages
$web_crawler->addHook('add-url-test',       \&add_url_test);    # if url doesn't exist in array then add for visit
$web_crawler->addHook('continue-test',      \&continue_test);   # to exit loop when we run out of URLs to visit

our $contents;   # declared at file scope so the final print can see it

sub invoke_test   { return 1; }   # stub: must exist or the hook call dies
sub follow_test   { return 1; }   # return true so links get followed
sub add_url_test  { return 1; }   # return true so new URLs get queued
sub continue_test { return 1; }   # return true to keep crawling

sub invoke_contents {
    my ($webcrawler, $hook, $url, $response, $structure) = @_;
    $contents = $structure;   # save the page contents in the global
}

# Start the web crawling
$web_crawler->run($url_name);
print $contents;
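Before involving WWW::Robot at all, a plain LWP::UserAgent fetch will show whether the page is even reachable over the dial-up link, separating network trouble from hook trouble. This is just a minimal standalone sketch, not part of the script above; the 30-second timeout is an arbitrary choice:

use strict;
use warnings;
use LWP::UserAgent;

print "URL to test: ";
my $url = <STDIN>;
chomp $url;

my $ua = LWP::UserAgent->new;
$ua->timeout(30);                 # fail fast instead of hanging on a slow link
my $response = $ua->get($url);

if ($response->is_success) {
    print "Fetched ", length($response->decoded_content), " bytes\n";
}
else {
    print "Fetch failed: ", $response->status_line, "\n";
}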
*********************************
My idea is that the user first inputs the website to be processed (I use http://www.sportinglife.com/), and then the $structure variable in sub invoke_contents is copied into a global variable. I have put a print statement at the end to see whether it will print the contents, so that I know if it works, but it doesn't seem to. I have a dial-up connection (believe it or not), and I left it for about 15 minutes and it didn't print anything, although I don't think it should take that long anyway. Any idea what I am doing wrong? Thanks
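One way to see whether the crawl is doing anything at all is to print from inside the hook itself, with autoflush turned on, instead of waiting for the final print. A minimal sketch, assuming $response is an HTTP::Response object as the hook signature suggests:

$| = 1;   # autoflush STDOUT so progress shows up immediately

sub invoke_contents {
    my ($webcrawler, $hook, $url, $response, $structure) = @_;
    print "Fetched: $url\n";                       # proves the hook fired
    print "Status:  ", $response->status_line, "\n";
    $contents = $structure;
}

If nothing prints after a minute or two, the crawler is never reaching the invoke-on-contents hook, which points at the hook registration or the starting URL rather than the network.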