msalerno has asked for the wisdom of the Perl Monks concerning the following question:

I need to zip a batch of files using some logic for the archiving. Because these files must be zipped before they are backed up, I need to validate the resulting archives. The plan is to zip the files, then loop through every file in the zip and validate its CRC. The problem I am having is that I cannot get the CRCs to match. I am using an example from here. I'm using the following sub for validation:
    use Archive::Zip qw( :ERROR_CODES :CONSTANTS );
    use File::Path 'mkpath';
    use IO::File;

    ##
    # Zipfile validation
    ##
    my $zipvalidate = Archive::Zip->new();
    if ( $zipvalidate->read( $archivedir.$filename ) != 0 ) {
        print STDERR "ERROR Opening $archivedir$filename\n";
        next;
    }
    foreach my $member ($zipvalidate->members()){
        my $fh = IO::File->new_tmpfile or print "Unable to make new temp file: $!";
        $member->extractToFileHandle($fh);
        if ($member->extractToFileHandle($fh) != 0){
            print "Error in $archivedir$filename\n";
            next;
        }
        seek($fh, 0, 0);
        binmode($fh);
        my $buffer;
        my $bytesRead;
        my $crc = 0;
        while ( $bytesRead = $fh->read( $buffer, 32768 ) ) {
            $crc = Archive::Zip::computeCRC32( $buffer, $crc );
        }
        printf( "\nFrom CALC: %08x", $crc );
        printf( "\tFrom ZIP: %08x", $member->crc32() );
        print "\t";
        print $member->fileName();
        print "\n";
        #undef $fh;
    }
    }
Any assistance would be greatly appreciated. Thanks

Replies are listed 'Best First'.
Re: Zip file CRC validation
by graff (Chancellor) on Jul 06, 2009 at 23:24 UTC
    I tried it like this, and on the first zip file I happened to look at, it worked just fine -- the computed crcs all matched the ones stated in the zip file itself:
        #!/usr/bin/perl
        use strict;
        use IO::File;
        use Archive::Zip qw/:ERROR_CODES :CONSTANTS/;

        my $Usage = "Usage: $0 file.zip\n";
        die $Usage unless ( @ARGV == 1 and -f $ARGV[0] );

        my $zip = Archive::Zip->new();
        if ( $zip->read( $ARGV[0] ) != AZ_OK ) {
            die "Archive::Zip failed to read $ARGV[0]\n";
        }
        for my $zfile ( $zip->members ) {
            my $fh = IO::File->new_tmpfile
                or die "Unable to create temp file: $!\n";
            $fh->binmode;
            if ( $zfile->extractToFileHandle( $fh ) != AZ_OK ) {
                warn sprintf( "Extraction failed for %s\n", $zfile->fileName() );
                next;
            }
            seek( $fh, 0, 0 );
            my ( $buffer, $nread );
            my $crc = 0;
            while ( $nread = $fh->read( $buffer, 32768 )) {
                $crc = Archive::Zip::computeCRC32( $buffer, $crc );
            }
            my $status = ( $crc == $zfile->crc32()) ? "good" : "broke";
            printf( "Status: %s\tcomp crc: %08x\tfile crc: %08x\t%s\n",
                    $status, $crc, $zfile->crc32(), $zfile->fileName());
        }

    UPDATE: I wonder if the problem in the OP code might be here:

        $member->extractToFileHandle($fh);
        if ($member->extractToFileHandle($fh) != 0){
            print "Error in $archivedir$filename\n";
            next;
        }

    It looks like you are extracting the file twice to the same file handle, and maybe that is causing each file to have its contents doubled (and you would almost never get the same crc in that case). In fact, I just added a second "extractToFileHandle" call in my version, and all the data files failed the crc check.
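
The doubling effect can be reproduced without any zip file at all. The following is a minimal sketch, assuming only that Archive::Zip is installed; the sample string is made up for illustration. It computes the CRC of some data once, then of the same data written twice, and also shows that feeding the chunks through a running CRC (as the read loop does) gives the same value as hashing the concatenated data in one go.

```perl
use strict;
use warnings;
use Archive::Zip;

# Hypothetical sample payload standing in for one extracted member.
my $data = "hello, zip\n";

# CRC of the data as it should appear after a single extraction.
my $once = Archive::Zip::computeCRC32($data);

# CRC of the data as it appears after two extractions to the same handle.
my $twice = Archive::Zip::computeCRC32( $data . $data );

# Same doubled data, but fed in two chunks through a running CRC,
# which is what the 32768-byte read loop does.
my $chained = Archive::Zip::computeCRC32( $data, $once );

printf "single:  %08x\n", $once;
printf "doubled: %08x\n", $twice;
printf "chained: %08x\n", $chained;
```

The "doubled" and "chained" values agree with each other, while the "single" value will almost certainly differ from both, which is the mismatch reported in the original post.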
      Yikes!! You are correct, sir. I can't believe that I overlooked such a simple mistake. Thanks for pointing it out. I was so damn focused on the CRC check that I didn't check myself. Thanks
          ##
          # Zipfile validation
          ##
          my $zipvalidate = Archive::Zip->new();
          if ( $zipvalidate->read( $archivedir.$filename ) != 0 ) {
              print STDERR "ERROR Opening $archivedir$filename\n";
              next;
          }
          foreach my $member ($zipvalidate->members()){
              my $fh = IO::File->new_tmpfile or print "Unable to make new temp file: $!";
              binmode($fh);
              if ($member->extractToFileHandle($fh) != 0){
                  print "Error in $archivedir$filename\n";
                  next;
              }
              seek($fh, 0, 0);
              my $buffer;
              my $bytesRead;
              my $crc = 0;
              while ( $bytesRead = $fh->read( $buffer, 32768 ) ) {
                  $crc = Archive::Zip::computeCRC32( $buffer, $crc );
              }
              printf( "\nFrom CALC: %08x", $crc );
              printf( "\tFrom ZIP: %08x", $member->crc32() );
              print "\t";
              print $member->fileName();
              print "\n";
              undef $fh;
          }
          }
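
For anyone who wants to try the single-extraction approach end to end, here is a self-contained sketch. It is not the code from the thread: the helper name member_crc, the member names, and the throwaway archive contents are all invented for illustration, and it assumes Archive::Zip plus the core File::Temp and IO::File modules. It builds a small archive, reads it back, extracts each member exactly once, and compares the recomputed CRC with the stored one.

```perl
use strict;
use warnings;
use IO::File;
use File::Temp qw( tempfile );
use Archive::Zip qw( :ERROR_CODES );

# Hypothetical helper: recompute one member's CRC, extracting it only once.
sub member_crc {
    my ($member) = @_;
    my $fh = IO::File->new_tmpfile
        or die "Unable to create temp file: $!\n";
    binmode($fh);    # set binary mode before any data is written
    $member->extractToFileHandle($fh) == AZ_OK
        or die "Extraction failed for " . $member->fileName() . "\n";
    seek( $fh, 0, 0 );
    my ( $buffer, $crc ) = ( '', 0 );
    while ( $fh->read( $buffer, 32768 ) ) {
        $crc = Archive::Zip::computeCRC32( $buffer, $crc );
    }
    return $crc;
}

# Build a throwaway archive so the sketch does not depend on existing files.
my ( $tmp, $zip_path ) = tempfile( SUFFIX => '.zip', UNLINK => 1 );
close $tmp;
my $out = Archive::Zip->new();
$out->addString( "first file\n",         'a.txt' );
$out->addString( "second file\n" x 1000, 'b.txt' );
$out->writeToFileNamed($zip_path) == AZ_OK
    or die "Could not write $zip_path\n";

# Read it back and collect any member whose CRC does not match.
my $zip = Archive::Zip->new();
$zip->read($zip_path) == AZ_OK or die "Could not read $zip_path\n";
my @bad = grep { member_crc($_) != $_->crc32() } $zip->members();

print @bad
    ? "CRC mismatch in: " . join( ', ', map { $_->fileName() } @bad ) . "\n"
    : "All CRCs match\n";
```

Dying on a failed temp file or extraction, rather than printing and carrying on, keeps a failed step from being mistaken for a CRC mismatch later in the loop.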
Re: Zip file CRC validation
by jwkrahn (Abbot) on Jul 06, 2009 at 18:58 UTC
        my $fh = IO::File->new_tmpfile or print "Unable to make new temp file: $!";
        $member->extractToFileHandle($fh);
        if ($member->extractToFileHandle($fh) != 0){
            print "Error in $archivedir$filename\n";
            next;
        }
        seek($fh, 0, 0);
        binmode($fh);

    You should binmode the filehandle just after you create it.

      I moved the binmode and it made no difference.
          my $fh = IO::File->new_tmpfile or print "Unable to make new temp file: $!";
          binmode($fh);
          $member->extractToFileHandle($fh);

          my $fh = IO::File->new_tmpfile or print "Unable to make new temp file: $!";
          $member->extractToFileHandle($fh);
          binmode($fh);

      Output:

          From CALC: 2b35e7ab    From ZIP: 9bdd56f6    test.log

      Thanks for the suggestion.