Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

I want to grab a bit of information from IMDB.

I can do this with LWP::UserAgent, using something like this:

my $ua = LWP::UserAgent->new;
my $response = $ua->get('http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku');

and if I do that with a name that exactly matches something in IMDB's database, it will redirect the browser to another URL, in this case http://imdb.com/name/nm0244630/

Now I can parse the information in that page to get what I want, but actually all I need is this person's IMDB URL, so if I could figure out whether their server returned a Location: header that sends the browser to this URL, that would save unnecessary processing.

Of course, if this doesn't happen (and the URL remains "http://imdb.com/find?nm=on;mx=20;q=liza%20dushku" because you got her name wrong), I need to do some further processing. But what I really want is something like this pseudocode:

my $response = $ua->get('http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku');
if ($response included a relocation to a URL) {
    store that URL
} else {
    do the more complicated stuff
}
Any ideas?


($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss')
=~y~b-v~a-z~s; print

Re: How to tell if a URL returned a Location: header?
by blokhead (Monsignor) on May 29, 2004 at 01:58 UTC
    In the general case, you can backtrack the response chain using the previous method for HTTP::Response objects. I got a lot of help in this example from the GET script that comes with LWP:
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    my $url = "http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku";
    my $response = $ua->get($url);

    my @chain = ( $response );
    while ( $chain[0]->previous ) {
        unshift @chain, $chain[0]->previous;
    }

    for (@chain) {
        printf "%s %s --> %s\n" =>
            $_->request->method,
            $_->request->url->as_string,
            $_->status_line;
    }

    __END__
    GET http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku --> 302 Found
    GET http://imdb.com/name/nm0244630/ --> 200 OK
    For your purposes, it's probably simpler to just compare the URL of the final HTTP::Response object with the one you gave to the UserAgent. If they're different, you must have encountered some sort of redirect:
    my $response = $ua->get($url);
    if ( $response->request->url->as_string ne $url ) {
        ...
    }
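A variation on the comparison above, offered only as a sketch: rather than comparing URL strings, walk the previous chain and check whether any earlier response was a redirect. The helper name redirected_url is hypothetical, and the response chain is built by hand here so the example runs without contacting IMDB; with a live request, $final would simply be the return value of $ua->get($url).

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: return the final URL if at least one redirect
# was followed on the way to $response, otherwise return nothing.
sub redirected_url {
    my ($response) = @_;
    for (my $r = $response->previous; $r; $r = $r->previous) {
        return $response->request->uri->as_string if $r->is_redirect;
    }
    return;
}

# Hand-built chain standing in for what $ua->get() would return.
my $hop = HTTP::Response->new(302, 'Found',
    [ Location => 'http://imdb.com/name/nm0244630/' ]);
$hop->request(HTTP::Request->new(
    GET => 'http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku'));

my $final = HTTP::Response->new(200, 'OK');
$final->request(HTTP::Request->new(GET => 'http://imdb.com/name/nm0244630/'));
$final->previous($hop);

my $found = redirected_url($final);
print defined $found ? "$found\n" : "no redirect\n";
```

Checking is_redirect on each earlier response matters because previous is also populated when LWP retries a request after an authentication challenge, which is not a redirect.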

    blokhead

Re: How to tell if a URL returned a Location: header?
by cees (Curate) on May 29, 2004 at 03:50 UTC

    Instead of using get, create your own HTTP::Request object and call simple_request. From the docs:

    The difference from request() is that simple_request() will not try to handle redirects or authentication responses.

    So only a single request will be made, and you can then call is_redirect on the HTTP::Response object you get back to see if a Location header was returned or if a page was returned. Something like this:

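    use LWP::UserAgent;
    use HTTP::Request;

    my $request = HTTP::Request->new(GET => 'http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku');
    my $ua = LWP::UserAgent->new;
    my $response = $ua->simple_request($request);
    if ($response->is_redirect) {
        # store that URL (the redirect target is in the Location header)
        my $location = $response->header('Location');
    } else {
        # do the more complicated stuff
    }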
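One detail worth sketching for the "store that URL" branch: a server may send a relative Location value, so it can help to resolve it against the URL that was requested. The helper name redirect_target is hypothetical, and the 302 response is constructed by hand (with an assumed relative Location) so the example runs offline; in real use the response would come from simple_request.

```perl
use strict;
use warnings;
use HTTP::Response;
use URI;

# Hypothetical helper: return the absolute redirect target, or nothing
# if the response is not a redirect or carries no Location header.
sub redirect_target {
    my ($response, $request_url) = @_;
    return unless $response->is_redirect;
    my $location = $response->header('Location');
    return unless defined $location;
    # Resolve a possibly relative Location against the requested URL.
    return URI->new_abs($location, $request_url)->as_string;
}

my $request_url = 'http://imdb.com/find?nm=on;mx=20;q=eliza%20dushku';

# Fake 302 standing in for the result of $ua->simple_request($request).
my $fake = HTTP::Response->new(302, 'Found',
    [ Location => '/name/nm0244630/' ]);

my $target = redirect_target($fake, $request_url);
print defined $target ? "$target\n" : "no redirect\n";
```

If the Location header is already absolute, URI->new_abs returns it unchanged, so the same helper covers both cases.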

    - Cees

Re: How to tell if a URL returned a Location: header?
by Cody Pendant (Prior) on May 31, 2004 at 03:37 UTC
    Thank you both for your help, and sorry for the delay in saying so.

    I processed seven or eight hundred such requests long ago and am in "do more complicated stuff" mode right now.



    ($_='kkvvttuubbooppuuiiffssqqffssmmiibbddllffss')
    =~y~b-v~a-z~s; print