use strict;
use warnings;
use LWP::UserAgent;
use WWW::Robot;

print "Please input the URL of the site to be searched\n";
my $url_name = <STDIN>;   # The user inputs the URL to be searched
chomp $url_name;          # strip the trailing newline, or the URL won't parse

# Create an instance of the web crawler
my $web_crawler = WWW::Robot->new(
    NAME      => 'My WebCrawler',
    VERSION   => '1.000',
    USERAGENT => LWP::UserAgent->new,
    EMAIL     => 'aca03lh@sheffield.ac.uk',
);

# Below the hooks of the web crawler are set. Every handler referenced
# here must actually be defined, and the *-test hooks must return true
# or the crawler quietly visits nothing.
$web_crawler->addHook('invoke-on-all-url',  \&invoke_test);
$web_crawler->addHook('follow-url-test',    \&follow_test);
$web_crawler->addHook('invoke-on-contents', \&invoke_contents); # to be able to get contents from webpages
$web_crawler->addHook('add-url-test',       \&add_url_test);    # if url doesn't exist in array then add for visit
$web_crawler->addHook('continue-test',      \&continue_test);   # to exit loop when we run out of URLs to visit

our $contents;   # declared at file scope so the final print can see it

sub invoke_test   { return 1; }   # stub: must exist or the hook call dies
sub follow_test   { return 1; }   # return true so links get followed
sub add_url_test  { return 1; }   # return true so new URLs get queued
sub continue_test { return 1; }   # return true to keep crawling

sub invoke_contents {
    my ($webcrawler, $hook, $url, $response, $structure) = @_;
    $contents = $structure;   # save the page contents in the global
}

# Start the web crawling
$web_crawler->run($url_name);
print $contents;
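Before involving WWW::Robot at all, a plain LWP::UserAgent fetch will show whether the page is even reachable over the dial-up link, separating network trouble from hook trouble. This is just a minimal standalone sketch, not part of the script above; the 30-second timeout is an arbitrary choice:

use strict;
use warnings;
use LWP::UserAgent;

print "URL to test: ";
my $url = <STDIN>;
chomp $url;

my $ua = LWP::UserAgent->new;
$ua->timeout(30);                 # fail fast instead of hanging on a slow link
my $response = $ua->get($url);

if ($response->is_success) {
    print "Fetched ", length($response->decoded_content), " bytes\n";
}
else {
    print "Fetch failed: ", $response->status_line, "\n";
}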
*********************************
My idea is that the user first inputs the website to be processed (I use http://www.sportinglife.com/), and then the $structure variable in sub invoke_contents is copied into a global variable. I have put a print statement at the end to see whether it will print the contents, so that I know if it works, but it doesn't seem to. I have a dial-up connection (believe it or not), and I left it for about 15 minutes and it didn't print anything, although I don't think it should take that long anyway. Any idea what I am doing wrong? Thanks
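One way to see whether the crawl is doing anything at all is to print from inside the hook itself, with autoflush turned on, instead of waiting for the final print. A minimal sketch, assuming $response is an HTTP::Response object as the hook signature suggests:

$| = 1;   # autoflush STDOUT so progress shows up immediately

sub invoke_contents {
    my ($webcrawler, $hook, $url, $response, $structure) = @_;
    print "Fetched: $url\n";                       # proves the hook fired
    print "Status:  ", $response->status_line, "\n";
    $contents = $structure;
}

If nothing prints after a minute or two, the crawler is never reaching the invoke-on-contents hook, which points at the hook registration or the starting URL rather than the network.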