in reply to Re^4: Zip file from WWW::Mechanize
in thread Zip file from WWW::Mechanize

"\215" (and basically every other byte ≥128) are being replaced with "\357\277\275". "\357\277\275" is the UTF-8 encoding of \x{FFFD}, the replacement character used to represent bad data.
$ perl -MEncode -e'printf "%04X\n", ord decode "UTF-8", "\357\277\275"'
FFFD

It sounds like something tried to decode the zip file.

$ perl -MEncode -we'print decode "UTF-8", "\215"' | od -c
Wide character in print at -e line 1.
0000000 357 277 275
0000003
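
The same thing can be seen on a longer buffer. This is only a sketch: the sample bytes are made up (a zip signature followed by a few arbitrary bytes), and the decode/encode round trip stands in for whatever is actually mangling your download.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: zip local-file-header signature plus three
# arbitrary bytes, two of which are >= 0x80.
my $raw = "PK\x03\x04\x8D\x00\xFF";

# Decoding binary data as UTF-8 turns each invalid byte into U+FFFD.
my $chars = decode('UTF-8', $raw);

# Writing the decoded string back out as UTF-8 emits EF BF BD for each one.
my $mangled = encode('UTF-8', $chars);

# Format a byte string as space-separated hex pairs.
sub hex_dump {
    my ($bytes) = @_;
    return join ' ', map { sprintf('%02X', ord($_)) } split //, $bytes;
}

printf "raw:     %s\n", hex_dump($raw);
printf "mangled: %s\n", hex_dump($mangled);
```

The mangled copy is longer than the original and the original high bytes can't be recovered from it, which is why the saved zip is unusable.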

While WWW::Mechanize's content calls decoded_content (defined in HTTP::Message), decoded_content shouldn't attempt to decode a zip file. It only applies charset decoding to responses whose MIME type is text/*.
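
To illustrate, here is a small sketch that builds two HTTP::Response objects by hand. The one-byte body and the two Content-Type values are made up for the demonstration; no request is sent.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical body: a single byte that is not valid UTF-8 on its own.
my $body = "\215";

# Two otherwise identical responses that differ only in Content-Type.
my %decoded;
my %raw;
for my $type ('text/html; charset=utf-8', 'application/zip') {
    my $res = HTTP::Response->new(200, 'OK', [ 'Content-Type' => $type ], $body);
    $decoded{$type} = $res->decoded_content;   # charset-decoded only for text/*
    $raw{$type}     = $res->content;           # always the bytes as received
}

for my $type (sort keys %decoded) {
    printf "%-26s decoded_content: U+%04X  content: 0x%02X\n",
        $type, ord($decoded{$type}), ord($raw{$type});
}
```

If that's what is happening, $mech->response()->content() should still hold the unmangled bytes, and decoded_content(charset => 'none') should skip the charset step as well.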

Is the web server incorrectly saying the .zip is a UTF-8 text file? Could you provide the output of the following:

print $mech->response()->headers()->as_string();

Delete "Set-Cookie:" headers and other authentication data before posting.
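
If you'd rather not edit the output by hand, something along these lines should work. It's a sketch on a hand-built HTTP::Headers object with made-up values; in your script you would start from $mech->response()->headers() instead.

```perl
use strict;
use warnings;
use HTTP::Headers;

# Hypothetical headers standing in for $mech->response()->headers().
my $headers = HTTP::Headers->new(
    'Content-Type' => 'text/html; charset=utf-8',
    'Set-Cookie'   => 'session=secret-value',
);

# Work on a copy so the real response keeps its cookies.
my $safe = $headers->clone;
$safe->remove_header('Set-Cookie', 'Set-Cookie2');

print $safe->as_string;
```

Check the result for anything else that is sensitive (authorization tokens, internal host names) before posting.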

Re^6: Zip file from WWW::Mechanize
by jck000 (Novice) on Mar 23, 2009 at 21:46 UTC
    Does this help? This is a Dumper($mech):
    'content-type' => 'text/html; charset=utf-8',
    'server' => 'Microsoft-IIS/6.0',
    'content-style-type' => 'text/css',
    'x-are' => 'you digging my headers?',
    'x-powered-by' => [
        'http://www.bandwidth.com',
        'ASP.NET'
    ],
    'content-disposition' => 'attachment;filename=auto_20090318_0610.zip',
    'client-response-num' => 1,
    'content-length' => '337991',
    'x-aspnet-version' => '2.0.50727',
    I notice the content-type and content length.

    Jack

      That confirms that the server is giving you garbage. It's saying that the zip file is really a UTF-8 HTML document.

      'content-type' => 'text/html; charset=utf-8',
      'content-disposition' => 'attachment;filename=auto_20090318_0610.zip',

      The solution is to fix up the response on the client side, right after it's received from the web server.

      use WWW::Mechanize;

      BEGIN {
          # Wrap WWW::Mechanize's private _make_request method so every
          # response can be corrected before its content gets decoded.
          my $old_make_request = WWW::Mechanize->can('_make_request');
          no warnings 'redefine';
          *WWW::Mechanize::_make_request = sub {
              my $response = $old_make_request->(@_);
              my $type  = $response->header('Content-Type');
              my $dispo = $response->header('Content-Disposition');
              # Relabel .zip attachments that the server claims are text/*.
              $response->header('Content-Type' => 'application/zip')
                  if defined($dispo) && $dispo =~ m{\.zip$}
                  && defined($type)  && $type  =~ m{^text/};
              return $response;
          };
      }

      Untested.

        Thanks!!!!!

        That worked with no change on my preliminary test.

        I never would have figured that out since it worked properly on FF.

        Jack