PerlMonks  

Get redirected URL

by xorl (Deacon)
on Aug 10, 2009 at 15:29 UTC ( [id://787356]=perlquestion )

xorl has asked for the wisdom of the Perl Monks concerning the following question:

So we've redone our website, and now I want to make sure all the redirects go to the correct place. What I want to do is request the old URL and check that the server returns a 301 status with the correct new URL.

I found "Is it possible to get the redirected URL?", but that didn't seem to have the answer. My script:

    use LWP;

    my $url = "oldurl";
    my $ua  = LWP::UserAgent->new;
    my $req = HTTP::Request->new(GET => $url);
    my $res = $ua->request($req);
    print $res->status_line;

When I run the above code, it seems to realize that the old URL is redirected, fetches the new URL, and then gives me a 200 OK for the status line. I need to figure out how to get the 301 status from the URL I gave it, not the status of the new URL it ends up fetching. Can someone help me?

Thanks in advance.

Brief Update: I know the URL in question is being redirected because of this:

    [xorl@xorlsbox ~/tools]$telnet ourdomainname 80
    Trying xxx.xxx.xxx.xxx...
    Connected to ourdomainname (xxx.xxx.xxx.xxx).
    Escape character is '^]'.
    GET /olddir HTTP/1.1
    host: ourdomainname

    HTTP/1.1 301 Moved Permanently
    Date: Mon, 10 Aug 2009 17:18:45 GMT
    Server: Apache/2.0.52 (Red Hat)
    Accept-Ranges: bytes
    Set-Cookie: PHPSESSID=c100bc24fda84639acb995ae36a4f8c4; path=/
    Expires: Thu, 19 Nov 1981 08:52:00 GMT
    Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
    Pragma: no-cache
    Set-Cookie: PHPSESSID=17c937ef60bcd4d587ba9f662384292e; path=/
    Set-Cookie: PHPSESSID=66db955db7d2e6987d235d8327b0abee; path=/
    Location: /somecrazy/new/location/here/
    Content-Length: 0
    Content-Type: text/html; charset=utf-8

    Connection closed by foreign host.

Update 2: Trying WWW::Mechanize now. Will post results.

Update 3: WWW::Mechanize worked, although the redirect_ok(0) solution did not. Here's the relevant code:

    my $mech = WWW::Mechanize->new();
    $mech->requests_redirectable([]);   # empty list: follow no redirects
    $mech->get($url);
    print $mech->response->code . " " . $mech->response->header("Location");

Replies are listed 'Best First'.
Re: Get redirected URL
by moritz (Cardinal) on Aug 10, 2009 at 15:48 UTC
    To quote "Re: Is it possible to get the redirected URL?":
    LWP (or WWW::Mechanize) would let you load Page B and automatically follow the HTTP redirects, with the final URL being available to you in the uri method.

    Doesn't that tell you whether you were redirected to the correct page?

    You could also use WWW::Mechanize and set $mech->redirect_ok(0), thus not following the redirect and examining the returned header.
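
A minimal sketch of the follow-the-redirect approach quoted above: let LWP follow the redirect, then read the final URL from $res->request->uri and the intermediate hops from $res->redirects. So that it runs without a network, a request_send handler returns canned responses. The host old.example, the paths /olddir and /newdir/, and the stub itself are assumptions for illustration, not part of the original question.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;

my $ua = LWP::UserAgent->new;

# Demo stub (assumption): answer requests locally instead of contacting a server.
# Remove this handler to run against a real site.
$ua->add_handler(request_send => sub {
    my ($request) = @_;
    return HTTP::Response->new(301, 'Moved Permanently', [ Location => '/newdir/' ])
        if $request->uri->path eq '/olddir';
    return HTTP::Response->new(200, 'OK');
});

# get() follows redirects by default, so $res is the final response.
my $res = $ua->get('http://old.example/olddir');

# URL of the request that produced the final response.
my $final_url = $res->request->uri;

# Intermediate redirect responses, oldest first.
my @hops = map { $_->code . ' -> ' . $_->header('Location') } $res->redirects;

print $res->status_line, "\n";
print "final URL: $final_url\n";
print "hop: $_\n" for @hops;
```

This shows where you ended up and which status codes led there, but the Location values in the hops are whatever the server sent, so they may be relative.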

Re: Get redirected URL
by JavaFan (Canon) on Aug 10, 2009 at 16:18 UTC
    Looking at the LWP::UserAgent manual page and grepping for redirect gives a couple of options you can give as a parameter for new that may do what you want. Searching for redirect further down gives you a couple of useful handlers you can use. And even further down, we stumble upon:
        $ua->simple_request( $request )
        $ua->simple_request( $request, $content_file )
        $ua->simple_request( $request, $content_cb )
        $ua->simple_request( $request, $content_cb, $read_size_hint )
            This method dispatches a single request and returns the
            response received. Arguments are the same as for request()
            described above.

            The difference from request() is that simple_request() will
            not try to handle redirects or authentication responses. The
            request() method will in fact invoke this method for each
            simple request it sends.

    And if that's not enough, there's more that a search for redirect reveals in the manual page. There's also the redirect_ok callback.

    Wouldn't you agree that grepping manual pages is much faster than writing a post on PerlMonks and waiting for an answer?
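
A minimal sketch of the simple_request() route from the manual excerpt above, applied to the original script. The request_send handler is a demo stub (an assumption, so the example runs offline) that replays the 301 and Location header from the telnet session in the question; old.example is a placeholder host.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

my $ua = LWP::UserAgent->new;

# Demo stub (assumption): return a canned 301 instead of contacting a server.
# Remove this handler to run against a real site.
$ua->add_handler(request_send => sub {
    return HTTP::Response->new(301, 'Moved Permanently',
        [ Location => '/somecrazy/new/location/here/' ]);
});

my $req = HTTP::Request->new(GET => 'http://old.example/olddir');

# simple_request() dispatches exactly one request and does not follow redirects,
# so the response is the one for the URL we asked for.
my $res = $ua->simple_request($req);

my $status   = $res->code;
my $location = $res->header('Location');

print "$status $location\n";
```

The only change from the script in the question is calling simple_request() instead of request().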

Re: Get redirected URL
by Anonymous Monk on Aug 10, 2009 at 15:48 UTC
        $ua->add_handler("request_send",  sub { shift->dump; return });
        $ua->add_handler("response_done", sub { shift->dump; return });

Re: Get redirected URL
by vitoco (Hermit) on Aug 10, 2009 at 17:09 UTC

    I'm not sure if $ua->max_redirect(0) before the request will do that for you.

Re: Get redirected URL
by LTjake (Prior) on Aug 11, 2009 at 14:49 UTC

    cf. requests_redirectable

        use strict;
        use warnings;
        use LWP::UserAgent;

        my $ua = LWP::UserAgent->new(
            requests_redirectable => [],
        );
        my $res = $ua->get( shift );
        print $res->status_line, "\n", 'Location: ', $res->header( 'location' ), "\n";

        __END__
        bricas@bricas-laptop:~$ perl red.pl http://bit.ly/gf4h1
        301 Moved
        Location: http://github.com/rjbs/tpf-grant-history/blob/master/history.txt
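
Building on the requests_redirectable approach above, here is a hedged sketch of checking a whole table of old URLs against their expected targets, which is what the original question was after. The %expected table, the check_redirect helper, the old.example host, and the canned responses are all hypothetical; the request_send stub exists only so the example runs offline. Because a server may send a relative Location (as in the telnet session in the question), the sketch resolves it against the requested URL with URI->new_abs before comparing.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Response;
use URI;

# Hypothetical table: old URL => URL it should redirect to.
my %expected = (
    'http://old.example/olddir' => 'http://old.example/newdir/',
    'http://old.example/about'  => 'http://old.example/company/',
);

# An empty list means no request method is redirectable: redirects are not followed.
my $ua = LWP::UserAgent->new( requests_redirectable => [] );

# Demo stub (assumption): canned redirects instead of a real server.
# '/about' deliberately points to the wrong place. Remove for real use.
my %canned = ( '/olddir' => '/newdir/', '/about' => '/wrong/' );
$ua->add_handler(request_send => sub {
    my ($request) = @_;
    my $target = $canned{ $request->uri->path };
    return HTTP::Response->new(404, 'Not Found') unless defined $target;
    return HTTP::Response->new(301, 'Moved Permanently', [ Location => $target ]);
});

# Returns a description of the problem, or undef if the redirect is as expected.
sub check_redirect {
    my ($ua, $old, $want) = @_;
    my $res = $ua->get($old);
    return 'expected 301, got ' . $res->status_line unless $res->code == 301;

    my $location = $res->header('Location');
    return '301 without a Location header' unless defined $location;

    # Resolve a possibly relative Location against the URL we requested.
    my $got = URI->new_abs($location, $old)->canonical->as_string;
    return "redirects to $got, expected $want"
        unless $got eq URI->new($want)->canonical->as_string;
    return;
}

my %problem;
for my $old (sort keys %expected) {
    my $why = check_redirect($ua, $old, $expected{$old});
    $problem{$old} = $why if defined $why;
    print defined $why ? "FAIL $old: $why\n" : "ok   $old\n";
}
```

Comparing canonical URL strings is a simplification; depending on the site, you may want to treat trailing slashes or query strings more leniently.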

    --
    "No matter how much you push the envelope it'll still be stationary."
