Re: Redirection with LWPUserAgent
by ikegami (Patriarch) on Dec 18, 2011 at 11:09 UTC
|
Your question makes no sense. LWP::UserAgent is an HTTP client, it doesn't talk to other HTTP clients, and it does follow redirections. | [reply] |
|
|
| [reply] |
|
|
kazak:
It looks like HTTP::Proxy will be able to help you with that.
...roboticus
When your only tool is a hammer, all problems look like your thumb.
| [reply] |
|
|
| [reply] |
|
|
| [reply] |
Re: Redirection with LWPUserAgent
by CountZero (Bishop) on Dec 18, 2011 at 13:54 UTC
|
HTTP status message 302 is "Found - The requested page has moved temporarily to a new url".Why would you want to change that in some cases? And how would you know which redirections are bad and which are good?
CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
| [reply] |
|
|
| [reply] |
|
|
Yes, I think HTTP::Proxy can do that. Your users connect to the server side of HTTP::Proxy and the client side then receives and filters the reply and the server side issues a redirection to another search engine if the reply is 404 or 410. All other status messages (including the redirection messages) should be passed along "as is" in my opinion.However there is a hidden snag in your idea: the HTTP protocol is stateless, so when the user clicks on one of the search engine's results and gets a "bad" result, you want him to redirect to another search engine with the original search request, but by that time the server (HTTP::Proxy) has all forgotten about that original search request. You will have to implement some form of session management or perhaps use a short living cookie to maintain state.
CountZero A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
| [reply] |
|
|
|
|
|
|
|
my $response = $ua->get('http://search.cpan.org/');
if ($response->is_success) {
print $response->decoded_content; # or whatever
}
else {
die $response->status_line;
}
HTTP status codes tells you that 4xx codes mean "it didn't work". 404 is "not found". And $response->is_success will be false. So it is easy to tell if a URL actually lead to somewhere.
The LWP will follow the 3xx redirection status codes up to the default of 7 links (the LWP page above tells you that too). This behavior can be changed but it sounds like you don't want to do that.
I am no expert on LWP, but I've written a few of these things - the complications and weird things that can happen are legion. I will venture to say that your chances of success are approximately 0% if you don't write some experimental code on your own to "play around" before you tackle your overall project given that it sounds pretty ambitious. I would write a basic framework and get that working before tackling the boundary cases and tricky handling of response codes!
When you encounter a problem (and you will), try a lot of experiments and then post some code, preferably something simple (subset of your actual code and use some URL where we can run your code to replicate your results). | [reply] [d/l] |