haidut has asked for the wisdom of the Perl Monks concerning the following question:
I have a question involving redirection of URLs. Suppose I am using LWP::UserAgent to send a GET/HEAD request to a URL. But that URL1 may be forwarding to URL2, which may be forwarding to URL3 and so on. So after several redirections we are hitting the final URLn. I want to be able to find out what that final URLn is. Does LWP::UserAgent or any other Perl module keep track of this information? Currently what I am doing is this:
    use LWP::UserAgent;
    use HTTP::Request;

    my $ua = LWP::UserAgent->new;
    $ua->timeout(5);
    my $request  = HTTP::Request->new(HEAD => $URL1);
    my $response = $ua->request($request);
    print $response->base, "\n";
This returns the final URLn in most cases, but not always. For example, suppose the final URLn is http://cnn.com/technology/internet/google.html
When I call the $response->base method, I get back the URL "cnn.com" rather than the full URL shown above. So some sites report the base URL as the site root and not the actual URL containing the data. On the other hand, LWP::UserAgent must clearly know what the full final URL is, because it has to fetch the data from it. Is the final URLn stored anywhere in the HTTP::Response object or HTTP::Headers object returned by LWP::UserAgent?
Any help will be greatly appreciated.
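For illustration, here is a minimal sketch of one way to read the final URL from the response itself rather than from base. It assumes that the request attached to the last response is the one LWP actually sent to the final URLn, and that the redirects method of HTTP::Response is available in the installed libwww-perl. The helper names final_url and redirect_chain are hypothetical, made up for this example. As far as I understand, base can differ from the request URL because it is allowed to come from headers such as Content-Base or Content-Location, which may explain the "cnn.com" result.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: URL of the request that produced this response.
# After redirects, this is the last URL that LWP actually requested.
sub final_url {
    my ($response) = @_;
    return $response->request->uri->as_string;
}

# Hypothetical helper: every URL visited, oldest first, ending with the final one.
# redirects() walks the chain of "previous" responses.
sub redirect_chain {
    my ($response) = @_;
    return map { $_->request->uri->as_string } ($response->redirects, $response);
}

# Only touch the network when a URL is given on the command line.
my $url = shift @ARGV;
if (defined $url) {
    my $ua       = LWP::UserAgent->new(timeout => 5);
    my $response = $ua->head($url);

    print "$_\n" for redirect_chain($response);
    print "Final: ", final_url($response), "\n";
}
```

Run it with the starting URL as the only argument. If the server never redirects, the chain contains just that one URL.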
Re: URL redirection and final URL
by ikegami (Patriarch) on Sep 08, 2008 at 21:54 UTC
by haidut (Novice) on Sep 10, 2008 at 02:14 UTC