in reply to Using WWW::Mechanize effectively
Here you go:
use strictures;
use WWW::Mechanize;
use Encode;

my $mech = WWW::Mechanize->new( autocheck => undef );
my $start = shift || die "Give a URL!\n";

$mech->get($start);
$mech->success
    or die "Sorry, sucker!\n", $mech->response->as_string;

for my $link ( $mech->find_all_links ) {
    printf "Link\n * text -> %s\n * URI -> %s\n",
        encode("UTF-8", $link->text) || "na",
        $link->url_abs;
}
perl ~/pm-1098382 http://yahoo.co.jp
Link
 * text -> na
 * URI -> http://bb.yahoo.co.jp/
Link
 * text -> 投資家情報
 * URI -> http://www.yahoo.co.jp/r/fiv

perl ~/pm-1098382 http://en.censor.net.ua
Link
 * text -> na
 * URI -> http://en.censor.net.ua/favicon.ico
Link
 * text -> "Censor.NET"
 * URI -> http://en.censor.net.ua/
Link
 * text -> Яндекс цитирования
 * URI -> http://yandex.ru/cy?base=0&host=censor.net.ua
find_all_links already hands you WWW::Mechanize::Link objects, so by attempting to create them yourself you are just kind of mangling them into something odd. You can see from the output that you will have to filter out JS, <link/> tags and the like. :P
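One way that filtering might look, as a rough sketch rather than the definitive approach: keep only <a> tags whose absolute URL is http or https. The helper name is_page_link is made up for illustration and is not part of WWW::Mechanize; the tag and url_abs accessors are the ones WWW::Mechanize::Link already provides. Which tags and schemes count as "real" links is an assumption here, so adjust to taste.

```perl
use strict;
use warnings;
use WWW::Mechanize;
use Encode qw(encode);

# Hypothetical helper (not part of WWW::Mechanize): true for links worth listing.
sub is_page_link {
    my ($link) = @_;

    # Keep only <a> tags; this drops <link>, <meta>, <frame> and friends.
    return 0 unless ( $link->tag // '' ) eq 'a';

    # Keep only http(s) targets; this drops javascript:, mailto: and the like.
    return 0 unless $link->url_abs =~ m{\Ahttps?://}i;

    return 1;
}

# Only hit the network when a URL is given on the command line.
if ( my $start = shift @ARGV ) {
    my $mech = WWW::Mechanize->new( autocheck => 0 );
    $mech->get($start);
    $mech->success
        or die "Could not fetch $start: ", $mech->response->status_line, "\n";

    for my $link ( grep { is_page_link($_) } $mech->find_all_links ) {
        printf "Link\n * text -> %s\n * URI -> %s\n",
            encode( "UTF-8", $link->text // "" ) || "na",
            $link->url_abs;
    }
}
```

If you would rather not write a helper, find_all_links also takes criteria such as tag => 'a' and url_abs_regex => qr{^https?://}i, which should get you much the same result.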