ezekiel has asked for the wisdom of the Perl Monks concerning the following question:

Hello all

For some time I have used Net::FTP to automate downloading files from FTP repositories. Recently, however, I have needed to download a file from an HTTP source. After looking around, it seemed LWP::UserAgent was an appropriate tool, and so:

use LWP::UserAgent;

my $URL = "http://target.site.com/path/to/file.gz";
my $ua = LWP::UserAgent->new();
my $response = $ua->get($URL);
if ($response->is_success) {
    print $response->content;
}
else {
    die $response->status_line;
}

Now, when I run it I get a server error 500. Yet if I browse to the URL with Firefox I have no problem accessing the file.

What am I doing wrong? And is LWP::UserAgent the best tool for this type of job?

Thanks

Replies are listed 'Best First'.
Re: Help with LWP::UserAgent server error 500
by tlm (Prior) on Jun 29, 2005 at 05:44 UTC

    To elaborate on PodMaster's reply, change the second line of your script to something like this:

    my $ua = LWP::UserAgent->new( agent => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050223 Firefox/1.0.1' );

    The argument to the agent option sets the User-Agent header, which determines the type of HTTP client that $ua presents itself as. For the sake of illustration, I used the string corresponding to my version of Firefox. The exact version of Firefox is probably not important, but if you want to match the one for your browser exactly, or more generally, if you want to know how best to replicate what your Firefox browser is telling the server, get yourself the LiveHTTPHeaders extension for Firefox. This extension pops up a window showing the exact conversation between your browser and the server, which you can then replicate with your script.
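To check what the agent option does without touching the network, a minimal sketch along these lines may help. It assumes a reasonably recent LWP in which `prepare_request` is available, and it reuses the placeholder URL from the question. Note that the agent value is only the value of the header, without a leading `User-Agent:` label.

```perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request;

# The agent string is the header value only; LWP adds the header name itself.
my $ua = LWP::UserAgent->new(
    agent => 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050223 Firefox/1.0.1',
);

# Build the request but do not send it.
my $request = HTTP::Request->new( GET => 'http://target.site.com/path/to/file.gz' );

# Let the user agent add its default headers (including User-Agent).
$ua->prepare_request($request);

# Print the request line and headers as LWP would send them.
print $request->as_string;
```

The printed headers can then be compared line by line with what LiveHTTPHeaders shows for the browser.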

    the lowliest monk

      Thanks for all the offers of help. Starting with the thoughts in this response I have made some progress. I now have:

      my $URL = "http://target.site.com/path/to/file.gz";
      my $ua = LWP::UserAgent->new( agent => 'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.8) Gecko/20050511 Firefox/1.0.4' );
      $ua->proxy(['http'], 'http://my.proxy.server/proxy.pac');
      my $response = $ua->get("$URL");
      if ($response->is_success) {
          print $response->content;
      }
      else {
          die $response->status_line;
      }

      I put in the proxy server, based on the automatic proxy configuration URL I use in Firefox for HTTP. Once I did this, the 500 error went away, to be replaced by a 404 Not Found error!

      I checked the URL by pasting it into the browser, so that is OK. I also changed the URL to a different file on an FTP site, and that works. So the problem appears to be HTTP-specific. Any further thoughts? And am I using that proxy call correctly?

      Thanks
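One possible explanation, offered as an assumption rather than something confirmed in this thread: `$ua->proxy` expects the URL of the proxy server itself, and LWP::UserAgent does not evaluate proxy auto-configuration (PAC) files. If the PAC URL is given as the proxy, the request probably goes to the web server that hosts `proxy.pac`, which could well answer 404. The sketch below uses a hypothetical host and port (`my.proxy.server:8080`); the real values would be whatever `PROXY host:port` entry the PAC file returns.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Hypothetical proxy host and port: replace with the host:port that the
# PAC file hands out, not with the URL of the PAC file itself.
$ua->proxy( ['http'], 'http://my.proxy.server:8080/' );

# Alternatively, read http_proxy and related variables from the environment:
# $ua->env_proxy;

# Show which proxy is now configured for the http scheme.
print "HTTP proxy: ", $ua->proxy('http'), "\n";
```

Opening the PAC URL in a browser or text editor shows the script it contains, and the proxy address can usually be read from there.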

Re: Help with LWP::UserAgent server error 500
by CountZero (Bishop) on Jun 29, 2005 at 05:50 UTC
    Definitely something wrong with the server as the '500' error is a "server encountered an unexpected condition which prevented it from fulfilling the request" error.

    If there were something wrong with your client, you should have gotten a '4xx' type of error.

    Check the headers that Firefox sends and receives (use the extension "Live HTTP headers"), compare them with the headers your script sends and receives, and see what the difference is. It may be that the server disallows 'robot'-type access, but the response should then be a '403' error ("forbidden").
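For the script's side of that comparison, something like the following sketch could be used. `dump_exchange` is a hypothetical helper name, not part of LWP; it relies only on `$response->request`, `as_string`, `status_line` and `headers`, which HTTP::Response and HTTP::Request provide.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Return the request LWP recorded for this response, plus the status line
# and headers of the response, as one block of text.
sub dump_exchange {
    my ($response) = @_;
    my $request = $response->request;    # request that produced this response
    return join "\n",
        '--- request ---',
        ( $request ? $request->as_string : '(no request recorded)' ),
        '--- response ---',
        $response->status_line,
        $response->headers->as_string;
}

# Only talks to the network when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my $ua = LWP::UserAgent->new;
    print dump_exchange( $ua->get($url) );
}
```

Saved under any file name and run with the URL as its only argument, it prints both halves of the conversation for comparison with the browser's.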

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Help with LWP::UserAgent server error 500
by PodMaster (Abbot) on Jun 29, 2005 at 05:28 UTC
    Now, when I run it I get a server error 500. Yet if I browse to the URL with Firefox I have no problem accessing the file.

    What am I doing wrong?

    You're talking to a broken server. If it works with Firefox, pretend to be Firefox (send a similar user agent and whatever other headers Firefox sends). If that doesn't work, contact the webmaster.

    MJD says "you can't just make shit up and expect the computer to know what you mean, retardo!"
    I run a Win32 PPM repository for perl 5.6.x and 5.8.x -- I take requests (README).
    ** The third rule of perl club is a statement of fact: pod is sexy.

      This isn't strictly true. LWP itself returns a status code of 500 to indicate that something went wrong on the client side. I have seen this most frequently when trying to access a site without setting the proxy environment variable that my company's network requires.

      C:\>get http://perlmonks.com
      C:\>500 Can't connect to www.perlmonks.com:80 (connect: Unknown error)

      If the OP had missed a required proxy setting step then this would explain why they could access the file using a browser (presumably set up to use a proxy) but not the script.

      As a general point, it may be useful to use LWP::Debug qw(+); to trace the progress of an LWP transaction.
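As a rough sketch of how to tell the two kinds of 500 apart: the LWP::UserAgent documentation says that error responses generated by the library itself carry a `Client-Warning` header with the value `Internal response`. `is_internal_response` below is a hypothetical helper name, and the call to `env_proxy` assumes the proxy is configured through environment variables such as `http_proxy`.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

# Load proxy settings from environment variables (http_proxy and so on).
$ua->env_proxy;

# True when LWP generated the response itself, for example because it
# could not connect, rather than receiving it from the server.
sub is_internal_response {
    my ($response) = @_;
    return ( $response->header('Client-Warning') // '' ) eq 'Internal response';
}

# Only talks to the network when a URL is given on the command line.
if ( my $url = shift @ARGV ) {
    my $response = $ua->get($url);
    print $response->status_line, "\n";
    print is_internal_response($response)
        ? "Generated by LWP on the client side\n"
        : "Sent by the server (or a proxy)\n";
}
```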

Re: Help with LWP::UserAgent server error 500
by Anonymous Monk on Jun 29, 2005 at 15:07 UTC
    When you install LWP, you should find GET, POST, and HEAD in your perl/bin directory. Might be batch files if on Win32. GET has an option to view the response chain of headers. So on the command line, you could type

       GET -s -e http://target.site.com/path/to/file.gz

    and you should be able to see what comes back. Also, as that's obviously a dummy string, have you checked the spelling to make sure you're not requesting an invalid path or file?

    Cheers