MintyFresh has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to pull URLs from a MySQL database and check them to make sure they are still valid. Here's what I am currently using:
.....
my($query) = "SELECT weburl, webdate FROM table WHERE approval = ' +1' LIMIT 10";
my($sth) = $dbh->prepare($query);
$sth->execute || die("Could not execute!");
while(@row = $sth->fetchrow_array) {
    $whereto = "$row[0]";
    &letsgo;
}
exit;

sub letsgo {
    $ua->agent("Mozilla/8.0");
    $req = new HTTP::Request 'GET' => '$whereto';
    $res = $ua->request($req);
    if ($res->is_success) {
        print "$whereto WORKS\n";
    } else {
        print "$whereto doesnt work\n";
    }
}
That shows every URL as being dead even if it's not. If I change
$req = new HTTP::Request 'GET' => '$whereto';
to something like
$req = new HTTP::Request 'GET' => 'http://www.google.com';
It works fine. What am I missing when getting URLs from the database?

Replies are listed 'Best First'.
Re: Checking URLs with LWP::UserAgent
by Fastolfe (Vicar) on Dec 04, 2001 at 03:01 UTC

    Your problem is that you're single-quoting '$whereto'. Single quotes do not interpolate variables, so Perl uses the literal string $whereto as the URL, and that is not a valid URL. Put it in "double quotes".
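
To make the quoting difference concrete, here is a minimal sketch. The sample URL simply stands in for whatever $row[0] holds; it is an assumption for illustration, not a value from the database.

```perl
use strict;
use warnings;

# Sample value standing in for $row[0] (assumed for illustration).
my $whereto = 'http://www.google.com';

my $single = '$whereto';   # single quotes: no interpolation, literal text
my $double = "$whereto";   # double quotes: the variable is interpolated
my $bare   = $whereto;     # no quotes: the value itself

print "single: $single\n";
print "double: $double\n";
print "bare:   $bare\n";
```

The first line of output shows the literal text $whereto, which is what HTTP::Request was being handed in the original code.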

    Also, if all you're interested in is whether or not the URL is "OK" (i.e. giving a 200-series response), consider using HEAD instead of GET, and verify the response is "good" with is_success like you're doing. A HEAD request returns the same status line and headers as GET, but without the body. Properly written CGI scripts, for instance, can skip a lot of unnecessary work on HEAD requests when all you want is success or failure. At a minimum you're saving bandwidth.

    It might do you good to print out the status of the $res response (for example $res->status_line), so you know why it failed.
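
A rough sketch of both suggestions together: a HEAD request plus the status line in the output. The helper name check_url and the 10-second timeout are assumptions of mine, not anything from the original post, and URLs are taken from the command line here instead of the database.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# check_url is a hypothetical helper name, not from the original post.
sub check_url {
    my ($ua, $url) = @_;
    my $res = $ua->head($url);   # HEAD: status line and headers, no body
    return $res->is_success
        ? "$url WORKS (" . $res->status_line . ")"
        : "$url doesn't work: " . $res->status_line;
}

# The timeout value is an arbitrary choice for this sketch.
my $ua = LWP::UserAgent->new(agent => 'Mozilla/8.0', timeout => 10);

# URLs come from the command line; with no arguments nothing is requested.
print check_url($ua, $_), "\n" for @ARGV;
```

In the original script you would call check_url($ua, $row[0]) inside the fetchrow_array loop instead of going through a global.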

(crazyinsomniac) Re: Checking URLs with LWP::UserAgent
by crazyinsomniac (Prior) on Dec 04, 2001 at 03:33 UTC
    Fastolfe has you covered for the most part. I would add that there is no need for the quotes at all: $whereto is the same as "$whereto". Unless you're adding more stuff, like "http://$whereto" or some such thing, you can leave the quotes off.

    I'd like to know where you got the idea that you needed quotes?

    Also, Fastolfe is right that you should use HEAD requests, but apparently a few servers return broken HEAD replies, for which there is a workaround at LWP head replacement.
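
One simple way to cope with such servers is to retry with GET when HEAD fails. This is only a sketch of that idea and not necessarily what the linked node does; the helper name head_or_get and the timeout are assumptions.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# head_or_get is a hypothetical helper: try HEAD first, and fall back
# to a full GET only if the HEAD response is not a success.
sub head_or_get {
    my ($ua, $url) = @_;
    my $res = $ua->head($url);
    $res = $ua->get($url) unless $res->is_success;
    return $res;
}

# The timeout value is an arbitrary choice for this sketch.
my $ua = LWP::UserAgent->new(agent => 'Mozilla/8.0', timeout => 10);

# URLs come from the command line; with no arguments nothing is requested.
for my $url (@ARGV) {
    my $res = head_or_get($ua, $url);
    print "$url: ", $res->status_line, "\n";
}
```

The trade-off is that a genuinely dead URL costs two requests instead of one, while healthy servers still get the cheap HEAD.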

     
    ___crazyinsomniac_______________________________________
    Disclaimer: Don't blame. It came from inside the void

    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"