tone has asked for the wisdom of the Perl Monks concerning the following question:

Hi there fellow Monks,

I have a bit of an odd problem with uncompressing a .gz file. If I uncompress an Office file that has been gzipped, the Office file becomes corrupt. This happens with both the old Office formats (doc, xls) and the new formats (docx, xlsx).

If, however, I uncompress any other gzipped file (e.g. avi, txt, bmp), there are no problems. I've tried using Archive::Extract and IO::Uncompress::Gunzip and I get the same results with both.

Here is the code I'm using for Archive::Extract:
    use strict;
    use warnings;
    use Archive::Extract;

    my $file = "Testing.xls.gz";
    my $ae   = Archive::Extract->new(archive => $file);

    # Method calls are not interpolated inside strings, so concatenate.
    unless ($ae->extract) {
        print "There was an error: " . $ae->error . "\n";
    }
Here is the code I'm using for IO::Uncompress::Gunzip:
    use strict;
    use warnings;
    use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

    my $file = "Testing.xls.gz";
    my $out  = "Testing.xls";

    my $status = gunzip $file => $out
        or die "gunzip failed: $GunzipError\n";
Any help with this will be greatly appreciated.

Re: Office file becomes corrupt once extracted from gz
by MidLifeXis (Monsignor) on Oct 22, 2008 at 15:18 UTC

    Is there a binmode flag for either of those unzip modules? Also, is there the possibility that the files were corrupted on compression instead (again, see binmode)?

    --MidLifeXis

      I can't believe I didn't think of binmode. I searched the docs and found that IO::Uncompress::Gunzip has a binmode option, so I enabled that and now the files are being extracted perfectly.

      Thanks for your help.
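
For anyone landing here later, a minimal sketch of what the fix probably looks like. It assumes the option referred to above is BinModeOut, which is how the IO::Uncompress::Gunzip docs name the output binmode switch (the thread does not say which option was used). The file names and the 256-byte test payload are stand-ins, not the original Office file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip     qw(gzip   $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Stand-in for a binary Office file: every byte value once,
# including \r and \n, which text-mode output can mangle on Windows.
my $original = pack 'C*', 0 .. 255;

my $dir = tempdir(CLEANUP => 1);
my $gz  = "$dir/Testing.xls.gz";   # hypothetical file names
my $out = "$dir/Testing.xls";

# Build a .gz file to work with.
gzip \$original => $gz
    or die "gzip failed: $GzipError\n";

# Assumed fix: ask gunzip to write the output file in binary mode.
gunzip $gz => $out, BinModeOut => 1
    or die "gunzip failed: $GunzipError\n";

# Read the result back in raw mode and compare byte for byte.
open my $fh, '<:raw', $out or die "cannot open $out: $!\n";
my $restored = do { local $/; <$fh> };
close $fh;

print $restored eq $original ? "round trip OK\n" : "output differs\n";
```

On Unix-like systems binmode makes no difference, so the corruption described in this thread would only be expected on Windows, where text-mode output translates line endings.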

Re: Office file becomes corrupt once extracted from gz
by kyle (Abbot) on Oct 22, 2008 at 15:24 UTC

    Are these files also corrupt when uncompressed with the standard command line tools (or the tools used to compress them) rather than with the Perl modules you're using?

      If I uncompress the files with the utility 7-Zip there are no problems with the files.
        ptar would've handled it just as well :)