walto has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks, I need to zip a file containing the letter "ß". Archive::Zip can only handle the Unix charset. The selected file is compressed, but the ß is replaced with an unreadable character. Does anybody know a solution? After a deep search I found: http://www.bei-priess.de/computer/perl/winsonder.php. Although this page might be of interest to German-speaking monks, here is a short translation:
Edit the following registry keys, using the values in []:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage\OEMCP [ = 1252]
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Nls\CodePage\OEMHAL [ = vga850.fon]
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\WOW\boot\oemfonts.fon [ = vga850.fon]
After doing this the file straße.mdb was properly zipped in Windows XP.

Replies are listed 'Best First'.
Re: Archive::Zip and german charset
by borisz (Canon) on Feb 16, 2004 at 15:33 UTC
    I tested it, and the special characters come out readable, as expected. I think you are uncompressing the file on another machine, or on a remotely mounted filesystem, where the charsets differ. IMHO this is not a zip problem.
    #!/usr/bin/perl
    use Archive::Zip qw( :ERROR_CODES :CONSTANTS );
    my $zip = Archive::Zip->new();
    my $member = $zip->addDirectory('test/');
    $member = $zip->addFile('test/straße.txt');
    $member->desiredCompressionMethod(COMPRESSION_DEFLATED);
    die 'write error' unless $zip->writeToFileNamed('someZip.zip') == AZ_OK;
    Boris
Re: Archive::Zip and german charset
by iburrell (Chaplain) on Feb 16, 2004 at 20:00 UTC
    Does the file contain the special character? Or does the file name contain it? Archive::Zip should treat all files as binary content and not care about the encoding of the file.

    The file name is a different issue. My impression is that the ZIP format, Archive::Zip, and Unix file systems store file names as binary strings. They don't care about the encoding. However, Perl could be translating the strings from the native encoding to UTF-8. Also, some programs handling Unix file names assume they are UTF-8 (or a different native encoding). They will either consider the German character invalid or translate it to something else.
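
To make the encoding point concrete, here is a minimal sketch using Perl's core Encode module. It prints the bytes that represent ß in three encodings, and then shows what happens when the Windows-1252 byte is read as if it were code page 850. That CP850 is what the unzipping side assumes is only a guess for illustration; nothing in this thread confirms it.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+00DF is LATIN SMALL LETTER SHARP S (ß).
my $sharp_s = "\x{DF}";

# Return the bytes of $text in $encoding as space-separated hex pairs.
sub bytes_hex {
    my ($encoding, $text) = @_;
    return join ' ', map { sprintf '%02X', ord } split //, encode($encoding, $text);
}

# The same character is stored as different bytes in each encoding.
my %hex = map { $_ => bytes_hex($_, $sharp_s) } qw(cp1252 cp850 UTF-8);
printf "%-6s %s\n", $_, $hex{$_} for qw(cp1252 cp850 UTF-8);

# Reading the Windows-1252 byte as CP850 yields a different character.
my $misread = decode('cp850', encode('cp1252', $sharp_s));
printf "cp1252 byte read as cp850: U+%04X\n", ord $misread;
```

If the name is written with one encoding and displayed with another, the byte survives unchanged but the character shown does not, which would match the "unreadable character" described in the question.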

Re: Archive::Zip and german charset
by walto (Pilgrim) on Feb 16, 2004 at 20:40 UTC
    borisz's code is about what I already tried. The script runs on Windows ME, so the charset could be windows-1250? The name of the file that must be zipped is straße.mdb. Using WinZip on the Win ME machine shows the proper name of the compressed file in the archive (archive filename = straße.zip). Unzipping the archive on a Win XP machine (where the archive is mailed to, using the unzip function of Windows Explorer) gives the right archive name, but the extracted file shows a different character in place of ß. I already tried stra\xdfe.txt for the filename, but the result was the same. borisz might be right that it is not a problem with Archive::Zip. I tried extracting the file with WinRAR, with the same results.
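
One thing that might be worth trying, given these symptoms: Archive::Zip's addFile accepts an optional second argument, the name to store in the archive, so the stored name can be encoded differently from the name on disk. The sketch below only builds such a name. The helper `oem_member_name` is hypothetical, and the choice of Windows-1252 for the disk name and CP850 for the archive name is an assumption about the two machines, not something established in this thread.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: convert a file name from the encoding used on disk
# to the encoding the unzipping program is assumed to expect.
sub oem_member_name {
    my ($disk_name, $disk_encoding, $zip_encoding) = @_;
    my $characters = decode($disk_encoding, $disk_name);   # bytes -> characters
    return encode($zip_encoding, $characters);             # characters -> bytes
}

my $disk_name   = "stra\xDFe.mdb";    # "straße.mdb" as Windows-1252 bytes
my $member_name = oem_member_name($disk_name, 'cp1252', 'cp850');

# Show both names as hex so the single differing byte is visible.
print 'on disk: ', unpack('H*', $disk_name),   "\n";
print 'in zip:  ', unpack('H*', $member_name), "\n";
```

With Archive::Zip the result would be passed as `$zip->addFile($disk_name, $member_name)`. Whether the name then displays correctly still depends on the code page the extracting program assumes.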
Re: Archive::Zip and german charset
by Vautrin (Hermit) on Feb 16, 2004 at 16:28 UTC
    Out of curiosity, what character encoding are you using? UTF-8? High ASCII? Something else?

Re: Archive::Zip and german charset
by Anonymous Monk on Feb 16, 2004 at 16:58 UTC
    The selected file is compressed, but the ß is replaced with an unreadable character. Does anybody know a solution?
    How? Where are you looking at the filename? Have you tried other zip software? What does the filename look like there (wherever that may be: console, GUI filesystem browser, zip program, whatever)?

    You need to describe exactly what you did, with what, and where, because, as borisz says, it doesn't look like an Archive::Zip problem.

Re: Archive::Zip and german charset
by CloneArmyCommander (Friar) on Feb 18, 2004 at 16:49 UTC
    I had the same problem opening folders with spaces in the name through the prompt, like a folder named My Documents, so I replaced the problem character with a wildcard, and it opened the folder perfectly fine with My*Documents :). I don't know if it will work, but I hope my reply has been some help :).
      This is some foolishness of Windows XP explorer.
      a) My colleague zipped a file on his desktop with folder information. Opening it on Windows 2000 with Power Archiver was no problem. Opening it with Explorer, where XP displays archives like folders, and clicking on "Dokumente und Einstellungen" ("Documents and Settings" in English) would make Explorer crash.
      b) Same when zipping straße.txt to straße.zip: XP Explorer displays the filename inside the archive corrupted!
      Conclusions?
      1. Using a proper unzipping program should get rid of those problems. There are freeware ones that can be installed without administrator rights.
      2. Maybe someone should report the problem to Microsoft and then we all wait for the next Windows update ...
      It might work from the prompt, but I could not open a file with it:
      $file="stra.e\.txt";
      $member = $zip->addFile("$file");
      Using File::Find did not work out either:
      find(\&wanted, "c:/test");
      sub wanted {
          $file=$File::Find::name;
          if (/stra.e\.txt/){
              $file=$_;
          }
      }
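
A note on the two attempts above: addFile expects a literal file name, so a pattern such as stra.e\.txt is looked up verbatim and not found. A minimal sketch of the alternative is to let File::Find collect the real names that match the pattern and pass those on. The helper `find_by_pattern` and the starting directory are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return the full paths of all plain files under $dir
# whose base name matches $pattern.
sub find_by_pattern {
    my ($dir, $pattern) = @_;
    my @found;
    find(
        sub {
            # Inside the callback, $_ is the base name and
            # $File::Find::name is the full path.
            push @found, $File::Find::name if -f $_ && $_ =~ $pattern;
        },
        $dir,
    );
    @found = sort @found;
    return @found;
}

# ".{1,2}" stands in for the hard-to-type character: one byte in
# Windows-1252, two bytes if the name is stored as UTF-8.
my @matches = find_by_pattern('.', qr/^stra.{1,2}e\.txt\z/);
print "$_\n" for @matches;
```

Each entry in `@matches` is a real path that can be handed to `$zip->addFile`. This only solves finding the file; how the name is displayed after unzipping is a separate question.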