in reply to Re^2: Zip file from WWW::Mechanize
in thread Zip file from WWW::Mechanize

The newer file is twice the size. What's the output of
od -c auto_20090318_0610.zip | head -n 4
and
od -c report.zip | head -n 4

Re^4: Zip file from WWW::Mechanize
by jck000 (Novice) on Mar 23, 2009 at 19:16 UTC
    jack3:/usr/genoais/aiq/aiq/bin$ od -c auto_20090318_0610.zip | head -n 4
    0000000   P   K 003 004 024  \0  \0  \0  \b  \0   a   I   s   :   S   F
    0000020 215 200   C   & 005  \0 345 233   *  \0 026  \0  \0  \0   a   u
    0000040   t   o   _   2   0   0   9   0   3   1   8   _   0   6   1   0
    0000060   .   c   s   v 344 275   [   s 333 310 226   . 370   > 021 363
    jack3:/usr/genoais/aiq/aiq/bin$ od -c report.zip | head -n 4
    0000000   P   K 003 004 024  \0  \0  \0  \b  \0   a   I   s   :   S   F
    0000020 357 277 275 357 277 275   C   & 005  \0 357 277 275 357 277 275
    0000040   *  \0 026  \0  \0  \0   a   u   t   o   _   2   0   0   9   0
    0000060   3   1   8   _   0   6   1   0   .   c   s   v 357 277 275 357
      "\215" (and basically every other byte ≥ 128) is being replaced with "\357\277\275". "\357\277\275" is the UTF-8 encoding of U+FFFD, the replacement character used to represent bad data.
      $ perl -MEncode -e'printf "%04X\n", ord decode "UTF-8", "\357\277\275"'
      FFFD

      It sounds like something tried to decode the zip file.

      $ perl -MEncode -we'print decode "UTF-8", "\215"' | od -c
      Wide character in print at -e line 1.
      0000000 357 277 275
      0000003
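
      To check a whole file rather than the first few lines of od output, one option is to count the three-byte sequence directly. The following is only a sketch using the core Encode module: it simulates the suspected damage on a short byte string (the first 20 bytes from the od output above) instead of reading your files, and the variable names are mine.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The first 20 bytes of the good zip, taken from the od output above.
my $original = "PK\003\004\024\0\0\0\b\0aIs:SF\215\200C&";

# Simulate the suspected damage: decode the bytes as UTF-8, then
# write the resulting characters back out as UTF-8.
my $damaged = encode('UTF-8', decode('UTF-8', $original));

# Count occurrences of the 3-byte UTF-8 encoding of U+FFFD.
my $marker = "\xEF\xBF\xBD";
my $hits   = () = $damaged =~ /\Q$marker\E/g;

printf "original: %d bytes, damaged: %d bytes, replacement sequences: %d\n",
    length $original, length $damaged, $hits;
```

      Each invalid byte grows from one byte to three, which would also account for the file getting bigger. The original bytes cannot be recovered from the damaged copy, since every bad byte maps to the same U+FFFD.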

      While WWW::Mechanize's content calls decoded_content (defined in HTTP::Message), decoded_content should only attempt charset decoding for responses with MIME type text/*, so it shouldn't try to decode a zip file.
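
      A minimal sketch of the difference between the two accessors, built on a hand-made HTTP::Response so it runs without a network. The text/html Content-Type here is an assumption made for illustration, standing in for a server that mislabels the zip; the byte string is a shortened, hypothetical payload.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical response: zip-like bytes served with a text/* type
# and a UTF-8 charset (assumed for illustration).
my $raw = "PK\003\004\215\200C&";
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/html; charset=utf-8' ],
    $raw,
);

my $bytes   = $res->content;          # the octets exactly as received
my $decoded = $res->decoded_content;  # charset-decoded because the type is text/*

printf "content is intact: %s\n",
    $bytes eq $raw ? 'yes' : 'no';
printf "decoded_content contains U+FFFD: %s\n",
    $decoded =~ /\x{FFFD}/ ? 'yes' : 'no';
```

      If that turns out to be what is happening, writing $mech->response->content to a filehandle in binmode, rather than $mech->content, should sidestep the charset decoding.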

      Is the web server incorrectly saying the .zip is a UTF-8 text file? Could you provide the output of the following:

      print $mech->response()->headers()->as_string();

      Delete "Set-Cookie:" headers and other authentication data before posting.
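
      If editing the output by hand is a nuisance, the sensitive headers can be dropped from a copy before printing. This is a sketch against a made-up HTTP::Headers object; the header values and the list of fields to remove are assumptions, so adjust them to whatever your server actually sends. With a live object you would start from $mech->response()->headers()->clone instead.

```perl
use strict;
use warnings;
use HTTP::Headers;

# Hypothetical headers standing in for $mech->response()->headers().
my $headers = HTTP::Headers->new(
    'Content-Type'   => 'application/zip',
    'Content-Length' => 1234,
    'Set-Cookie'     => 'session=hypothetical-value',
);

# Work on a copy so the real response object is left alone.
my $safe = $headers->clone;
$safe->remove_header('Set-Cookie', 'Set-Cookie2', 'Authorization');

my $out = $safe->as_string;
print $out;
```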

        Does this help? This is from Dumper($mech):
        'content-type' => 'text/html; charset=utf-8',
        'server' => 'Microsoft-IIS/6.0',
        'content-style-type' => 'text/css',
        'x-are' => 'you digging my headers?',
        'x-powered-by' => [
            'http://www.bandwidth.com',
            'ASP.NET'
        ],
        'content-disposition' => 'attachment;filename=auto_20090318_0610.zip',
        'client-response-num' => 1,
        'content-length' => '337991',
        'x-aspnet-version' => '2.0.50727',
        I notice the content-type and content length.

        Jack