memwaster has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks

I'm working on a script that crawls a website and reports on any pages that return a 401 status code. The aim is to identify any pages that accept Basic Authentication over plain HTTP (and therefore send passwords in clear text). Below is a simplified version with the relevant code:

    use strict;
    use LWP::UserAgent;

    my $ua = LWP::UserAgent->new;
    my $uri = shift @ARGV;
    my $headreq = HTTP::Request->new(HEAD => $uri);
    my $headres = $ua->request($headreq);
    my $statuscode = $headres->code();
    print "Status 401 at $uri\n" if $statuscode == 401;

This works fine until we apply the solution, which is for the server to redirect us to the same address starting with https://. If I visit one of these 'fixed' pages with a browser and look at the headers, I see a request for http://page followed by a 302 response that redirects to https://page, and then a 401 response. My Perl script, however, tells me "Status 401 at http://page", which for my purposes is a false positive. Can anyone think of a better way to do this?

cheers

memwaster

Replies are listed 'Best First'.
Re: HTTP status codes
by ikegami (Patriarch) on Oct 29, 2007 at 17:29 UTC

    Do you want it to be silent for the redirection? Use a simple_request instead of request.

    my $ua = LWP::UserAgent->new;
    my $uri = shift @ARGV;
    my $headreq = HTTP::Request->new(HEAD => $uri);
    my $headres = $ua->simple_request($headreq);  # <----
    my $statuscode = $headres->code();
    print "Status 401 at $uri\n" if $statuscode == 401;

    Or do you wish for it to display the uri to which you were redirected? Access it through the Request that produced the Response.

    my $ua = LWP::UserAgent->new;
    my $uri = shift @ARGV;
    my $headreq = HTTP::Request->new(HEAD => $uri);
    my $headres = $ua->request($headreq);
    my $statuscode = $headres->code();
    if ($statuscode == 401) {
        my $final_uri = $headres->request()->uri();  # <----
        print "Status 401 at $final_uri\n";
    }

      Thank you both for your answers. I think I will use ikegami's second option to display the uri to which I am redirected and filter out any https:// results later. If I ignore redirections completely there might be some false negatives.

      cheers

      memwaster
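
The filtering step described in this reply could be sketched roughly as follows. This is only an illustration, not the poster's actual code: the helper name is_cleartext_401 is hypothetical, and it assumes the final URI is read from the request attached to the response, as in the second snippet above.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: true only when the response is a 401 and the request
# that produced it (after any redirects) used the plain http scheme.
sub is_cleartext_401 {
    my ($res) = @_;
    return 0 unless $res->code == 401;
    my $req = $res->request or return 0;
    return (($req->uri->scheme // '') eq 'http') ? 1 : 0;
}

# Only touch the network when a URI is given on the command line.
if (my $uri = shift @ARGV) {
    my $ua  = LWP::UserAgent->new;
    my $res = $ua->head($uri);    # HEAD requests follow redirects by default
    if (is_cleartext_401($res)) {
        my $final_uri = $res->request->uri;
        print "Status 401 at $final_uri\n";
    }
}
```

Because redirects are still followed, a page that bounces to some other plain-http URL that answers 401 is still reported, which is the false negative that ignoring redirections would cause.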

Re: HTTP status codes
by moritz (Cardinal) on Oct 29, 2007 at 17:22 UTC
    You can set the max_redirect property to 0 in the LWP::UserAgent constructor; the 302 will then be reported as the status code.
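
A minimal sketch of the max_redirect suggestion, for illustration only. The classify helper and its labels are hypothetical; max_redirect, is_redirect and the Location header lookup are standard LWP / HTTP::Response calls.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: label a response that was fetched without
# following redirects.
sub classify {
    my ($res) = @_;
    return 'redirect'     if $res->is_redirect;
    return 'unauthorized' if $res->code == 401;
    return 'other';
}

# Only touch the network when a URI is given on the command line.
if (my $uri = shift @ARGV) {
    # With max_redirect => 0 the 3xx response itself is returned.
    my $ua   = LWP::UserAgent->new(max_redirect => 0);
    my $res  = $ua->head($uri);
    my $kind = classify($res);
    if ($kind eq 'redirect') {
        my $location = $res->header('Location') // '(none)';
        print "Status ", $res->code, " at $uri, redirects to $location\n";
    }
    elsif ($kind eq 'unauthorized') {
        print "Status 401 at $uri\n";
    }
}
```

With this setup a 401 is only reported for the URI that was actually requested, so a page that redirects to https:// shows up as a redirect rather than as a 401.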