in reply to Web Spider problem
Running this:

    use strict;
    use WWW::Mechanize;
    use HTML::Strip;

    my($url) = 'http://www.google.co.uk';

    # autocheck => 1 makes failed requests die automatically
    my $mech = WWW::Mechanize->new(autocheck => 1);
    my $hs   = HTML::Strip->new();

    $mech->agent_alias('Linux Mozilla');
    $mech->get($url) or die "Page $url can't be reached";
    print "Made it past the url test";

    # Fetch the raw HTML and strip the markup from it
    my $page       = $mech->content;
    my $clean_text = $hs->parse( $page );
    $hs->eof;
    print $clean_text;

The output is:

    Made it past the url test iGoogle | Sign in Web Images News Maps New! Products Groups Scholar more »
    Advanced Search Preferences Language Tools Search: the web pages from the UK
    Advertising Programmes - Business Solutions - About Google - Go to Google.com ©2007 Google

That is the page content with all the HTML stripped out. What are you expecting?
Re^2: Web Spider problem
by bauer1sc (Initiate) on Jul 11, 2007 at 16:48 UTC
by rpanman (Scribe) on Jul 11, 2007 at 17:43 UTC