randomaccess3 has asked for the wisdom of the Perl Monks concerning the following question:

I posted this on StackOverflow, but the answer wasn't particularly helpful (it didn't solve my problem). The zip file I have has a non-ASCII character in it. It can be downloaded here, and the file is in "test_data": https://github.com/log2timeline/dfvfs/archive/master.zip It looks like an i, but it isn't. When I try to unzip it with the Perl code below, the i is changed to an A-like character. The code is:
use Archive::Zip qw( :ERROR_CODES );
my $testsArchive = "master.zip";
my $testsDirectory = "master/";
my $zip = Archive::Zip->new();
die 'read error' unless ( $zip->read( $testsArchive ) == AZ_OK );
$zip->extractTree( '', $testsDirectory );
I'm using version 1.57 of Archive::Zip on Windows 7 with Perl 5.22.1. If I run the same code on OS X it works fine, so it has something to do with the charset encoding on Windows, but I'm at a loss as to how to fix it. The suggestion was to use "$Archive::Zip::UNICODE", but I haven't had any success with that. I've also tried changing the codepage using a system call, but that didn't work either. Thanks in advance!
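An i turning into an A-like character is consistent with UTF-8 bytes being interpreted in a single-byte Windows code page. The following is only a minimal sketch of that effect, not a fix. It assumes, purely for illustration, that the character is U+00ED and that the code page is cp1252; neither is stated in the post.

```perl
use strict;
use warnings;
use Encode qw( encode decode );

# Hypothetical character for illustration: U+00ED (i with an acute accent).
my $name_char = "\x{ED}";

# Encoded as UTF-8, the single character becomes two bytes.
my $utf8_bytes = encode( 'UTF-8', $name_char );

# Reading those two bytes as cp1252 (assumed code page) yields two characters.
my $misread = decode( 'cp1252', $utf8_bytes );

printf "UTF-8 bytes: %s\n",
    join ' ', map { sprintf '%02X', ord($_) } split //, $utf8_bytes;
printf "misread as:  %s\n",
    join ' ', map { sprintf 'U+%04X', ord($_) } split //, $misread;
```

Under these assumptions the first misread character is U+00C3, a capital A with a tilde, which would match the A-like character described.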

Replies are listed 'Best First'.
Re: Help with ZIP Encodings on MSWin
by hippo (Archbishop) on Jun 16, 2016 at 10:05 UTC
    I posted this on StackOverflow, but the answer wasn't particularly helpful

    Good to know. It would be even better if you were to link to this entry on StackOverflow so that we could read the answer received and thus avoid providing the same unhelpful answer here. Thanks.

      Sorry, I did include 'The suggestion was to use "$Archive::Zip::UNICODE", however I haven't had any success with that', but I should have included the link as well. Thankfully someone else has posted it below.
Re: Help with ZIP Encodings
by Anonymous Monk on Jun 16, 2016 at 09:44 UTC
      Thanks, I'll check it out in more depth later. I'm having trouble installing Win32::Unicode via CPAN for some reason (both cpan Win32::Unicode, and manually downloading the repo). Will have to spend some more time on it. Cheers
Re: Help with ZIP Encodings
by Anonymous Monk on Jun 16, 2016 at 15:04 UTC
    Zips contain bytes, not characters. Files contain bytes, not characters. There's no need for encoding when going from one to the other. I suspect that your OS X file viewer understands Unicode and your Windows one doesn't. If you're reading the unzipped file with Perl, you may need to binmode $IN, ':utf8'.
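    A minimal sketch of what the binmode suggestion does when reading a file's contents. The scratch file and its two-byte payload are made up for illustration, and ':encoding(UTF-8)' is used here as the stricter variant of ':utf8'. Note that an I/O layer applies to data read from the handle, not to file names.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Write the two UTF-8 bytes for U+00ED (a made-up payload) to a scratch file.
my ( $out, $path ) = tempfile( UNLINK => 1 );
binmode $out;
print {$out} "\xC3\xAD\n";
close $out or die "close failed: $!";

# Without a decoding layer, Perl returns the raw bytes.
open my $raw, '<', $path or die "open failed: $!";
binmode $raw;
my $bytes = <$raw>;
close $raw;
chomp $bytes;

# With a UTF-8 layer, the same bytes are decoded into one character.
open my $IN, '<', $path or die "open failed: $!";
binmode $IN, ':encoding(UTF-8)';
my $chars = <$IN>;
close $IN;
chomp $chars;

printf "raw length: %d, decoded length: %d\n", length($bytes), length($chars);
```

    The raw read gives a string of length 2 and the decoded read a string of length 1, so the layer only matters at the point where the program reads or writes the data.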
      Right. Where would I put that in the code posted above? Because as far as I can tell the library should be taking care of that. Unless you're suggesting that the problem is within the library?