lampros21_7 has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have written some code to act as a web crawler, but during compilation it gives me a syntax error message. I can't find anything wrong with it, so I am posting it here:
use WWW::Mechanize;
use HTML::TreeBuilder;

print "Please input the URL of the site to be searched \n";
my $url_name = <STDIN>; # The user inputs the URL to be searched

#Create an instance of the webcrawler
our $webcrawler = WWW::Mechanize->new();
our $webcrawler = get($url_name);
our @website_links = $webcrawler->links($url_name);
# The HTML is stripped off the contents and the text is stored in an array of strings
our $x = 0;
our $stripped_html[$x] = $webcrawler( format => "text" );
$x = $x + 1;
my @visited_urls = ($url_name);

# While the array still has elements(URL's) check the content for links and strip the HTML
while (@website_links) {
    if ((grep {$_ eq $website_links[0] } @visited_urls) > 0) { # If the URL has been visited don't visit again
        shift @website_links;
It doesn't like the $stripped_html[$x] bit on the 13th line and gives me a syntax error there. I thought I could use the variable $x as a numeric index. What I want to do next is run a loop in which the array grows, because $x is incremented by 1 on every pass. Any ideas? Thanks

Replies are listed 'Best First'.
Re: Weird syntax error message
by sk (Curate) on Jul 30, 2005 at 00:48 UTC
    @stripped_html does not exist yet. You cannot declare just one element of an array. Try our @stripped_html; and then do $stripped_html[$x] = blah;

    BTW, why our? I would use my.
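
For readers wondering what the our/my distinction amounts to, here is a minimal sketch. The package name Crawler and both variable names are made up for the illustration and do not come from the original script. It shows that our creates a package variable reachable through its fully qualified name, while my creates a lexical that never enters the package's symbol table.

```perl
use strict;
use warnings;

package Crawler;    # hypothetical package name, for illustration only

our $shared  = 'package variable';    # lives in the symbol table as $Crawler::shared
my  $private = 'lexical variable';    # never entered into any symbol table

package main;

no warnings 'once';    # $Crawler::private is deliberately mentioned only once

print defined $Crawler::shared  ? "shared is visible\n"  : "shared is hidden\n";
print defined $Crawler::private ? "private is visible\n" : "private is hidden\n";
```

For a single-file script like the crawler, nothing needs to be reachable from other packages, which is presumably why my is the suggestion here.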

    -SK

Re: Weird syntax error message
by mifflin (Curate) on Jul 30, 2005 at 00:44 UTC
    You can't create an our (or my) instance of an array element. I think you meant...
    our @stripped_html;
    $stripped_html[$x] = $webcrawler( format => "text" );
    untested
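
A minimal sketch of the "declare the array first, then fill it" advice, using push so that no counter like $x is needed. The @pages list is a hypothetical stand-in for the page text. In the real crawler that text would presumably come from a method call on the WWW::Mechanize object (if I recall the API correctly, something like $webcrawler->content( format => 'text' )) rather than from calling $webcrawler as if it were a function.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the stripped text of each fetched page.
my @pages = ('first page text', 'second page text', 'third page text');

my @stripped_html;                 # declare the whole array once, up front
for my $text (@pages) {
    push @stripped_html, $text;    # append to the end; the array grows by one
}

printf "%d pages stored, last one: %s\n",
    scalar @stripped_html, $stripped_html[-1];
```

Assigning to $stripped_html[$x] and incrementing $x by hand would work too once the array is declared; push simply removes the bookkeeping.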
Re: Weird syntax error message
by holli (Abbot) on Jul 30, 2005 at 09:11 UTC
    Two notes:

    Instead of using @visited_urls you might be better off using %visited_urls, so you can avoid constructs such as
    my @visited_urls = ($url_name);
    ...
    while (@website_links) {
        if ((grep {$_ eq $website_links[0] } @visited_urls) > 0)
    and use this instead:
    my %visited_urls = ($url_name => 1);
    ...
    while (@website_links) {
        if ( $visited_urls{$website_links[0]} )
    Also, as we don't see the rest of your code, ensure you have a sleep command in your main loop to avoid hammering the webserver.
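
Both notes can be sketched together as follows. This is only an illustration under stated assumptions: the %links_on hash is a made-up link graph standing in for real fetches, and @queue, @order and $delay are names invented for the example. It shows a hash used as the "already visited" check and a sleep between pages.

```perl
use strict;
use warnings;

# Hypothetical link graph standing in for real pages; an actual crawler
# would fetch each URL and extract its links instead.
my %links_on = (
    'http://example.com/'  => ['http://example.com/a', 'http://example.com/b'],
    'http://example.com/a' => ['http://example.com/',  'http://example.com/b'],
    'http://example.com/b' => ['http://example.com/a'],
);

my $delay = 1;    # seconds to wait between pages; an arbitrary choice

my @queue = ('http://example.com/');    # URLs still to be processed
my %visited;                            # URL => number of times seen
my @order;                              # URLs in the order they were processed

while (@queue) {
    my $url = shift @queue;
    next if $visited{$url}++;    # seen before: skip; otherwise mark as seen

    push @order, $url;
    push @queue, @{ $links_on{$url} || [] };    # enqueue the page's links

    sleep $delay;    # pause so the web server is not hammered
}

print "$_\n" for @order;
```

The hash lookup costs the same no matter how many URLs have been seen, whereas the grep scans the whole @visited_urls array on every pass.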


    holli, /regexed monk/