Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am using LWP from behind a firewall to check whether some web links work. Some of the links come back OK, but others that actually exist give me "404 Not Found" errors. Does anyone know what the problem is? The code follows.

    use strict;
    use LWP::UserAgent;
    use HTTP::Request;
    use HTTP::Response;
    #use LWP::Debug qw(level); level('+');

    my ($request, $response, $ua, $status_line, $url);

    print "Content-type: text/html\n\n";

    $ua = LWP::UserAgent->new;
    $ua->agent("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; YComp 5.0.0.0)");
    $ua->env_proxy;
    $ua->proxy('http', 'http://proxy.com/');
    $ua->timeout(10);

    $url = 'http://www.netscape.com';                 # returns "200 OK"
    #$url = 'http://wp.netscape.com/fun/index.html';  # returns "404 Not Found"

    $request = HTTP::Request->new('GET', $url);
    $request->header('Accept' => 'text/html');
    $response = $ua->request($request);
    $status_line = $response->status_line;
    print "$status_line\n";

Thanks,

2003-04-16 edit ybiC: retitled from "LWP Problem"

Replies are listed 'Best First'.
Re: LWP Problem
by dws (Chancellor) on Apr 16, 2003 at 21:50 UTC
    Some of the links return to be OK but some of them which actually exist give me the "404 Not Found" errors.

    That second Netscape URL returns a 404 response when I hit it through a browser, and a 404 response when I hit it through LWP. The responses are different, which could be attributable to one of several things:

    • The site is looking for a cookie.
    • The site is decoding the agent string.
    • The site is making a decision based on some other part of the HTTP request header.

    Without a lot of experimenting, it's hard to tell which.
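One way to run that experiment is to log exactly what LWP sends and what comes back, then compare it with what a browser sends. The following is only a minimal sketch: traced_agent() and its $log argument are hypothetical names invented here, the shortened agent string is an assumption, and add_handler requires a reasonably recent libwww-perl (5.815 or later).

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Build a user agent that records every request and response it handles.
# traced_agent() and $log are hypothetical names for this sketch.
sub traced_agent {
    my ($log) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);
    $ua->agent('Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)');
    $ua->cookie_jar({});    # in-memory cookie jar, so cookies are kept
    $ua->env_proxy;         # pick up http_proxy / no_proxy if they are set

    # Record the request as LWP is about to send it.
    $ua->add_handler(request_send => sub {
        my ($request) = @_;
        push @$log, ">>> " . $request->method . " " . $request->uri,
                    $request->headers->as_string;
        return;             # returning nothing lets the request proceed
    });

    # Record the status line and headers of each response, redirects included.
    $ua->add_handler(response_done => sub {
        my ($response) = @_;
        push @$log, "<<< " . $response->status_line,
                    $response->headers->as_string;
        return;
    });

    return $ua;
}

# Trace every URL given on the command line.
for my $url (@ARGV) {
    my @log;
    traced_agent(\@log)->get($url);
    print "$_\n" for @log;
}
```

Run it with one or more URLs as arguments, once with and once without the proxy variables set. Differences in the logged headers (cookies, User-Agent, Via and similar proxy headers) should hint at which of the three causes applies.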

    Update: You've corrected the second URL. That one does work for me. You might be having an issue with your proxy server.

      I get the correct responses for every URL when I run the script on my local machine using ActivePerl. I am wondering if it is a firewall issue.
Re: LWP Problem
by crenz (Priest) on Apr 16, 2003 at 21:47 UTC

    Maybe I'm missing something, but http://wp.netscape.com/fun/index.html truly doesn't seem to exist to me ("Page Not Found!"). So the module is giving you the correct answer.

      Sorry, I posted the wrong URL. It is actually 'http://www.netscape.com/fun/'.
Re: LWP::UserAgent, HTTP proxy, 404 Not Found
by hossman (Prior) on Apr 17, 2003 at 05:29 UTC

    I switched your code to use a free proxy I found in an online list and had no problem.

    You can see the exact test case below...

      Where is the proxy config file on unix?
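As far as LWP itself is concerned, there is no dedicated proxy config file to look for: env_proxy reads environment variables such as http_proxy and no_proxy, which on Unix are usually set in a shell startup file. A small sketch of how that interacts with the explicit proxy() call in the original script follows. The proxy host names are made up for illustration, and the uppercase variable is cleared only so the example behaves the same in any environment.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical proxy addresses for illustration; substitute your own.
delete $ENV{HTTP_PROXY};    # avoid a clash with an uppercase variant
$ENV{http_proxy} = 'http://proxy.example.com:8080/';
$ENV{no_proxy}   = 'localhost,127.0.0.1';

my $ua = LWP::UserAgent->new;
$ua->env_proxy;    # copy the *_proxy environment variables into the agent

# proxy() with only a scheme returns the proxy currently set for it.
my $from_env = $ua->proxy('http');
print "from environment: $from_env\n";

# A later explicit proxy() call replaces the value taken from the environment,
# which is what happens in the original script.
$ua->proxy('http', 'http://other-proxy.example.com:3128/');
print "after override:   ", $ua->proxy('http'), "\n";
```

Because the original script calls proxy() after env_proxy, the hard-coded 'http://proxy.com/' is the proxy actually used for http requests, whatever the environment says.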