I have a script that gets a file using HTTP with BasicAuth. Sometimes the GET fails with a code 500 even though the Apache server logs show a code 200.

I thought it was an ISP proxy issue, so I added code to retry the GET request when is_success() is not true. After that I found that, on rare occasions, even though LWP reported a code 200 on the download, the actual contents of the file would be:
500 Can't connect to miniwall.foo.com:80 (connect: Invalid argument)
My next step was to create a separate file on the web server that contained the MD5 of the file I was trying to download. I updated the script to GET both files, calculate the MD5 of the file I am interested in and compare them. If the checksum does not match then retry both GET requests.
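
The comparison described above can be sketched roughly as follows. This is only an illustration: `md5_matches` is a hypothetical helper, not the check_file_md5() routine from the script, and it assumes the MD5 file begins with a 32-character hex digest (md5sum-style), which the post does not actually show.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: returns 1 if the hex MD5 of $data equals the
# digest found in $md5_file_content, 0 otherwise.
# Assumption: the MD5 file starts with a 32-character hex digest,
# optionally followed by a file name (md5sum-style).
sub md5_matches {
    my ($data, $md5_file_content) = @_;
    return 0 unless defined $data and defined $md5_file_content;

    # Anything that does not start with a digest (for example a
    # "500 Can't connect ..." message) is rejected here.
    my ($expected) = $md5_file_content =~ /\A\s*([0-9a-fA-F]{32})\b/;
    return 0 unless defined $expected;

    return lc($expected) eq md5_hex($data) ? 1 : 0;
}
```

A check like this only helps if the digest is computed on exactly the bytes that are later written to disk.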

This worked in general. Sometimes the MD5 file contained the "500 ..." error message but that was okay because the MD5 checksum calculated from the data file would never match the error message in the MD5 file.

Still, there are times when the downloaded file is written to disk containing the "500 ..." message.

I added use LWP::Debug qw(+); and started running the script and grepping the downloaded files between runs. After about 5 runs I got a case where the file contents contain an error. (Each run downloads 12 files + 12 MD5 files => 24 files).

The output for a file that was downloaded without error looks like:
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://miniwall.foo.com/update-4.0/net4801/etc-files/nrpe.cfg
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 624 bytes
LWP::Protocol::collect: read 4096 bytes
LWP::Protocol::collect: read 1539 bytes
LWP::UserAgent::request: Simple response: OK

A download where an error occurred looks like:
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://miniwall.foo.com/update-4.0/net4801/scripts/ping-gw-by-int.pl
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::UserAgent::request: Simple response: Internal Server Error

In the case of an error debugging printed:
LWP::UserAgent::request: Simple response: Internal Server Error
but is_success() returned true.

The part of the code that does the downloading and checks for errors is:
# GOAL : download a copy of the remote file
my $file_url = $base_url . $remote_file;
my $ua = LWP::UserAgent->new;
my $req = GET $file_url;
my $downloaded = 0;    # 0 = not d/l , 1 = d/l
my $tries = 4;
while( (! $downloaded) and ($tries >= 0) ) {
    -- $tries;
    #print "tries: $tries\n";
    $req->authorization_basic('mwuser', 'mwpass');
    my $response = $ua->request($req);
    my $file_content = $ua->request($req)->content;
    # print $file_content;
    my $md5_is_good;
    check_file_md5($file_url, $file_content, \$md5_is_good);
    #print "md5_is_good: $md5_is_good\n";
    if ($response->is_success and $md5_is_good) {
        $downloaded = 1;
    }
    else {
        my $msg = "$0: unable to get file. '$file_url'"
            . " '" . $response->code. "'"
            . " '" . status_message($response->code). "'" ;
        unless ($md5_is_good) {
            $msg .= " MD5 sum mismatch error";
        }
        warn $msg;
        if ($tries < 0) {
            next CFG_KEY;    # next file to download
        }
        else {
            print "re-trying $file_url\n";
        }
    }
}
my $file_content = $ua->request($req)->content;
# print $file_content;
Does this code fail to correctly check for a failed request?

I am certain that the check_file_md5() routine works (at least sometimes) because I have seen the "MD5 sum mismatch error" string output from time to time. Somehow there are times where the MD5 checksum matched and yet the "500 ..." string is written to the output file.

OH! Why do I do this line twice?
my $file_content = $ua->request($req)->content;
I wonder if the different calls to content() are making different GET requests to the server. I will remove the second call so the MD5 is calculated on the same data written to the output file and see if the problem goes away.

By the way, I am running the script with:
Perl 5.8.8
LWP::UserAgent version 2.033
OpenBSD 4.0

I ran ktrace and see that the connect() system call really is failing but I am not sure why.
30481 perl CALL connect(0x3,0x85b8bae0,0x10)
30481 perl RET connect -1 errno 22 Invalid argument

Update: I made two changes to the code. I replaced
my $response = $ua->request($req);
my $file_content = $ua->request($req)->content;
with
my $response = $ua->request($req);
my $file_content = $response->content;
I also removed the second call to ->content() (the one after the loop), so no extra GET request is made. After several more runs of the script I have not yet noticed the error message in an output file.
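
A minimal sketch of the pattern behind this fix: make one request per attempt, and use that single response object for the status check, the MD5 check, and the content that is kept. So the sketch runs without a network, `fetch_verified` is a hypothetical helper that takes a code reference; in the script that reference would wrap $ua->request($req). The response object only needs the is_success and content methods that HTTP::Response provides.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper (not part of LWP). $fetch is a code reference that
# performs ONE request and returns a response object with is_success()
# and content() methods, e.g. sub { $ua->request($req) }.
# Returns the verified content, or undef after $max_tries failed attempts.
sub fetch_verified {
    my ($fetch, $expected_md5, $max_tries) = @_;

    for my $attempt (1 .. $max_tries) {
        my $response = $fetch->();    # exactly one request per attempt
        next unless $response->is_success;

        # Status, checksum and returned data all come from the same response.
        my $content = $response->content;
        return $content if md5_hex($content) eq lc $expected_md5;
    }
    return;
}
```

With LWP this might be called as fetch_verified(sub { $ua->request($req) }, $expected_md5, 5), where $expected_md5 comes from the separately downloaded MD5 file. Because the returned content is the same data that was checked, a later failed request cannot slip its "500 ..." body into the output file.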

In reply to Infrequent LWP::UserAgent 500 connect: Invalid argument by superfrink
