Rumtis has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to use perl (ActivePerl 5.8.7 Build 813) to download a gzip file from a web server using LWP. Code:
    use LWP::UserAgent;
    use LWP::Debug qw(level);
    level('+');

    my $url = 'https://www.myurl.com/cgi-bin/file-download';

    $ENV{HTTPS_PROXY}          = 'PROXY';
    $ENV{HTTPS_PROXY_USERNAME} = 'PROXY_USERNAME';
    $ENV{HTTPS_PROXY_PASSWORD} = 'PROXY_PASSWORD';

    my $ua = LWP::UserAgent->new;
    $ua->proxy(['http','https']);

    my $response = $ua->post(
        $url,
        Content_Type => 'form-data',
        Content      => [
            userid           => 'USERID',
            password         => 'PASSWORD',
            recordlength     => '',
            recordterminator => 'NONE',
            remotefilename   => 'REMOTE_FILENAME'
        ]
    );

    if ($response->is_success) {
        print $response->content;
    }
    else {
        die $response->status_line;
    }

The POST to https://www.myurl.com/cgi-bin/file-download returns Content-Type: application/octet-stream and Content-Disposition: inline (basically a binary stream of inline data that is ultimately a GZIP file).

IE normally prompts you for a file name and saves the file properly. However, my $response->content output has carriage return/line feed pairs scattered throughout the binary stream. They do not appear after a fixed number of characters; they seem to be randomly placed.

Now, I'm only about 95% convinced that the site isn't sending me the CrLf in the stream, but if I go to the site and click on the button that does the post from the "download" page it returns just fine.

The site's download page isn't too complex, so I'm quite sure I'm not missing a field I have to send in the post.

Any advice on where to go from here would be appreciated.

Replies are listed 'Best First'.
Re: CrLf getting inserted into an octet-stream (gzip file)
by Corion (Patriarch) on May 23, 2008 at 13:31 UTC

    You want to use binmode when writing binary content.
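A minimal sketch of why the extra bytes look randomly placed, assuming the output handle is in Windows text mode: every 0x0A byte that happens to occur in the binary data is written as 0x0D 0x0A. The `:crlf` layer is used here only to simulate that text-mode behaviour on any platform; the sample bytes and file names are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Gzip magic bytes followed by an invented payload that happens to
# contain two 0x0A bytes.
my $data = "\x1f\x8b\x08\x00" . "ab\x0acd\x0aef";

my $dir = tempdir( CLEANUP => 1 );

# Text-mode write: the :crlf layer turns each 0x0A into 0x0D 0x0A,
# which is what a Windows handle does unless binmode is applied.
open my $text_fh, '>:crlf', "$dir/text.gz" or die "open: $!";
print {$text_fh} $data;
close $text_fh or die "close: $!";

# Binary write: binmode makes the handle pass bytes through untouched.
open my $bin_fh, '>', "$dir/binary.gz" or die "open: $!";
binmode $bin_fh;
print {$bin_fh} $data;
close $bin_fh or die "close: $!";

# Compare the on-disk sizes with the length of the original data.
my $text_size = -s "$dir/text.gz";
my $bin_size  = -s "$dir/binary.gz";

printf "original: %d bytes, text mode: %d bytes, binmode: %d bytes\n",
    length($data), $text_size, $bin_size;
```

The text-mode copy grows by one byte per 0x0A in the data, while the binmode copy keeps its original length. The same reasoning applies to STDOUT: call `binmode STDOUT;` before printing the response content.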

    You could also let LWP::UserAgent store the response content to disk directly, depending on your needs:

    $ua->post( $url, ':content_file' => $filename, ... );

    or, if this is an exercise to send the file to a client, give quicker feedback through a callback (but here you still have to binmode STDOUT):

    $ua->post( $url, ':content_cb' => sub { print $_[0] }, ... );
      Thanks Corion
      Setting binmode would do it, but I'm trying to write some cleaner code and use the content_file parameter.
      I've changed my post call to:
      my $response = $ua->post(
          $url,
          ':content_file' => '.\test.gz',
          Content_Type    => 'form-data',
          Content         => [
              userid           => 'USERID',
              password         => 'PASSWORD',
              recordlength     => '',
              recordterminator => 'NONE',
              remotefilename   => 'REMOTE_FILENAME'
          ]
      );
      However, the test.gz file is not being written. I figure I'm doing (or not doing) something stupid, but I'm not seeing it.
      So I'm appealing to the monks again to help me past my stupidity.
      Thanks for your help, though.

        Most likely, the current directory is not what you think it is, or the user your program is running as does not have sufficient rights to write to the file.

        Consider using an absolute path to save the file. That way you become independent of what the current directory is.

        As an aside, you might want to be careful when using backslashes. In most cases, you can use forward slashes even on Windows, and forward slashes don't have the danger of getting interpreted as escapes:

        ':content_file' => 'c:/temp/test.gz',
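One way to check both suspicions (unexpected current directory, missing write permission) is to print where a relative name would actually land. This is only a sketch: the file name `test.gz` stands in for the relative path used in the question, and the variable names are invented for illustration.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Spec;
use File::Basename qw(dirname);

# Hypothetical relative target, standing in for '.\test.gz'.
my $relative = 'test.gz';

# The directory the program is actually running in.
my $cwd = getcwd();

# Resolve the relative name against the current directory.
my $absolute = File::Spec->rel2abs($relative);

# -w tests whether the effective user may write to that directory.
my $target_dir = dirname($absolute);
my $writable   = ( -w $target_dir ) ? 'yes' : 'no';

print "current directory: $cwd\n";
print "file would be written to: $absolute\n";
print "directory writable: $writable\n";
```

If the printed path is not where you expected, pass the absolute path to ':content_file' instead of the relative one.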
Re: CrLf getting inserted into an octet-stream (gzip file)
by moritz (Cardinal) on May 23, 2008 at 13:28 UTC
    Try this before sending your output: binmode STDOUT;