cdherold has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to get the links within a redirected URL if the URL I am using in my code actually redirects (sometimes it does and sometimes it doesn't). How do I have Perl follow that redirect (if it exists) before it starts trying to extract links? I am currently using LWP::UserAgent and the following code (which right now doesn't follow redirects) ...
use LWP::UserAgent;
use HTML::LinkExtor;
use URI::URL;

$url = "http://this.url.may.redirect.to.url.with.content";
$ua = LWP::UserAgent->new;

# Set up a callback that collects links
my @links = ();
sub callback {
    my($tag, %attr) = @_;
    return if $tag ne 'a';
    push(@links, values %attr);
}

# Make the parser.
$p = HTML::LinkExtor->new(\&callback);

# Request document and parse it as it arrives
$res = $ua->request(HTTP::Request->new(GET => $url),
                    sub {$p->parse($_[0])});

# Expand all URLs to absolute ones
my $base = $res->base;
@links = map { $_ = url($_, $base)->abs; } @links;

print "<b>Original Links:</b> <p>@links<p>";
Thanks Monks!
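For reference, here is a minimal sketch of one way to approach this; it is not taken from the thread. It assumes that LWP::UserAgent follows GET redirects by itself up to its max_redirect limit (documented default: 7), so the response you get back already belongs to the final URL. The sub names fetch_following_redirects and extract_links are made up for this example. It parses the complete body after the request finishes instead of using a content callback, so only the final document is scanned for links.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTML::LinkExtor;

# Hypothetical helper: fetch $url and return the final HTTP::Response.
# Assumption: LWP follows GET redirects itself, up to max_redirect hops.
sub fetch_following_redirects {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(max_redirect => 7);
    return $ua->get($url);
}

# Hypothetical helper: return the href of every <a> tag in $html,
# made absolute against $base by HTML::LinkExtor.
sub extract_links {
    my ($html, $base) = @_;
    my @links;
    my $parser = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            return unless $tag eq 'a' && defined $attr{href};
            push @links, "$attr{href}";    # stringify the URI object
        },
        $base,
    );
    $parser->parse($html);
    $parser->eof;
    return @links;
}

# Only touch the network when a URL is given on the command line.
if (my $url = shift @ARGV) {
    my $res = fetch_following_redirects($url);
    die "Request failed: ", $res->status_line, "\n" unless $res->is_success;

    # redirects() lists the intermediate responses, oldest first.
    my @hops = $res->redirects;
    print "Followed ", scalar(@hops), " redirect(s); final URL: ",
        $res->request->uri, "\n";

    my $html = $res->decoded_content;
    $html = $res->content unless defined $html;
    print "$_\n" for extract_links($html, $res->base);
}
```

Run it with the URL as the only argument. If the redirect is done with a meta refresh tag or JavaScript rather than an HTTP 3xx status, LWP will not see it, and you would have to pull the target out of the page yourself.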
Replies are listed 'Best First'.
Re: Follow Redirect Before Accessing Web Content
by Corion (Patriarch) on Sep 08, 2007 at 06:51 UTC