jared.collier has asked for the wisdom of the Perl Monks concerning the following question:

I am writing a Perl script to traverse a directory and, for each .zip file it finds, break it into smaller .zip files while maintaining the order of the documents. I can write the code to traverse the directory. If anyone has insight into how I can pull files out of an archive and, maintaining the order, zip them up again with a maximum file size of 50MB, I would appreciate a response. Once I have something to go off of, I'll start posting my code as I work on it. Thanks, Jared

Re: Breaking up ZIP files
by Sinistral (Monsignor) on Oct 06, 2009 at 20:57 UTC

    You might not want to bother writing your own code, and instead use Perl's ability to invoke other programs. In this case, the tool you're looking for is zipsplit. If you're on *NIX (or Mac OS X, I'm guessing), it should have come along with the zip program. If you're on Windows, you can get it from the GnuWin32 project.
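
    A minimal sketch of calling zipsplit from Perl, assuming the Info-ZIP zipsplit binary is on your PATH. The helper name zipsplit_command and the command-line arguments are made up for illustration; the -s (sequential split, which keeps members in order), -n (maximum size in bytes) and -b (output directory) options are as I remember them from the zipsplit man page, so check them against your installed version:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the zipsplit argument list (hypothetical helper, not from the thread).
#   -s       split sequentially, keeping members in their original order
#   -n SIZE  maximum size in bytes of each output archive
#   -b PATH  directory to write the pieces to
sub zipsplit_command {
    my ($zip_path, $out_dir, $max_bytes) = @_;
    return ('zipsplit', '-s', '-n', $max_bytes, '-b', $out_dir, $zip_path);
}

# Runs only when called as: perl split_with_zipsplit.pl INPUT.zip OUTPUT_DIR
if (@ARGV == 2) {
    my @cmd = zipsplit_command($ARGV[0], $ARGV[1], 50 * 1024 * 1024);

    # The list form of system() bypasses the shell, so paths containing
    # spaces need no extra quoting.
    system(@cmd) == 0
        or die "zipsplit failed for $ARGV[0] (exit status " . ($? >> 8) . ")\n";
}
```

    zipsplit picks the output file names itself, so if you need a particular naming scheme you would have to rename the pieces afterwards.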

Re: Breaking up ZIP files
by DStaal (Chaplain) on Oct 06, 2009 at 19:29 UTC

    On Unix, I'd just feed the zip files through split... (You'll get files spread between multiple splits, of course. This may or may not be a problem for you.)

      I think the bigger problem for the OP would be running a zip file through split, which would effectively destroy the file, producing a number of same-size chunks that are individually unrecoverable.

      Of more help would, IMO, be to utilise Archive::Zip...

      A user level that continues to overstate my experience :-))
        This is my code so far. I'm continuing to work on it and will post as I make major steps forward:
        #!/usr/bin/perl
        use strict;
        use diagnostics;
        use File::Path;
        use Archive::Zip qw( :ERROR_CODES :CONSTANTS );

        ############################################
        # Prints the usage file if requested or if #
        # called incorrectly                       #
        ############################################
        if (@ARGV != 1 || $ARGV[0] eq '--help' || $ARGV[0] eq '-help' || $ARGV[0] eq '?') {
            &printUsage;    # Omitted for ease of review
        }

        # These variables store the locations of
        # the directory containing the zip
        # files to be tested, as well as the
        # location of the processed and error
        # files.
        my $basePath            = "C:/DocumentQC/Document Manipulation Area/Zip Test";
        my $inputDirectoryName  = "Input";
        my $outputDirectoryName = "Output";
        my $inputDirectory      = "$basePath/$inputDirectoryName/";
        my $outputDirectory     = "$basePath/$outputDirectoryName/";

        # Checks to see if all necessary directories
        # exist, and creates them if they don't.
        my $outputDirectoryExists = 0;
        opendir(PARENT_DIRECTORY, $basePath)
            || die("Cannot open directory, make sure you've edited the \$basePath variable in breakUpZipFiles1_0_0.pl.");
        my @parentDirectoryArray = readdir(PARENT_DIRECTORY);
        closedir(PARENT_DIRECTORY);
        for my $i (0 .. $#parentDirectoryArray) {
            if ($parentDirectoryArray[$i] eq $outputDirectoryName) {
                $outputDirectoryExists = 1;
            }
        }
        if ($outputDirectoryExists == 0) {
            mkdir("$basePath/$outputDirectoryName", 0755);
        }

        # Opens the input directory and reads all files
        # to the @input array.
        opendir(INPUT_DIRECTORY, $inputDirectory)
            || die("Cannot open input directory $inputDirectory - $!");
        my @input = readdir(INPUT_DIRECTORY);
        closedir(INPUT_DIRECTORY);

        # Do the nasty!
        # YOU MAY NEED Archive::Zip::setChunkSize( 4096 ); TO DEAL WITH THESE LARGE FILES
        my $fileIterator = 1;
        my $zeroFill     = "000";    # change later to fill properly
        MAIN_LOOP: for (my $i = 0; $i < @input; $i++) {
            #-------------------------------#
            # Ignore if it isn't a zip file #
            #-------------------------------#
            if ($input[$i] !~ /^(.*)\.zip$/) {
                print "\nSKIP FILE: \t\t$input[$i]";
                next MAIN_LOOP;
            }
            my $baseFileName = $1;    # captured before anything else can clobber $1
            print "\nARCHIVE ENCOUNTERED: \t$input[$i]";

            #--------------------------------------------------#
            # Create a zip object for the file to break apart, #
            # and a zip object for your smaller archive        #
            #--------------------------------------------------#
            my $inputZipFile = Archive::Zip->new();
            unless ( $inputZipFile->read("$inputDirectory$input[$i]") == AZ_OK ) {
                die "Error reading zip file: $input[$i]";
            }
            my $outputZipFile = Archive::Zip->new();

            #------------------------------------------------------#
            # Create a list of the files in the current zip object #
            #------------------------------------------------------#

            #---------------------------------------------------#
            # Add files from the old zip object sequentially to #
            # the new zip object, delete it from the old one    #
            #---------------------------------------------------#
            # USE Archive::Zip::MemberRead (Page 4)

            #-----------------------------------------------------#
            # Iterate until there are 250 files in the new object #
            #-----------------------------------------------------#

            #---------------------------------------------------------#
            # Zip the new object and place it in the pickup directory #
            #---------------------------------------------------------#
            my $outputFileName = "$outputDirectory${baseFileName}_$zeroFill$fileIterator.zip";
            unless ( $outputZipFile->writeToFileNamed($outputFileName) == AZ_OK ) {
                die "Error writing zip file: $outputFileName";
            }

            #-----------------------------------------------------------------#
            # If there are no more files in the old zip directory, destroy it #
            #-----------------------------------------------------------------#
        }
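
        One way the commented-out steps in the loop could be filled in, as a rough sketch rather than a finished answer. It batches members by compressed size instead of by a count of 250, since the original question asks for a 50MB cap. The names plan_batches and split_zip, the argument order, and the four-digit suffix are all hypothetical; the Archive::Zip calls used (read, members, compressedSize, addMember, writeToFileNamed) are the documented ones:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Group item indices into consecutive batches whose sizes sum to at most
# $limit. Order is preserved; an item larger than $limit ends up in a
# batch of its own. (Hypothetical helper.)
sub plan_batches {
    my ($sizes, $limit) = @_;
    my (@batches, @current);
    my $total = 0;
    for my $i (0 .. $#$sizes) {
        # Close the current batch if adding this item would exceed the limit.
        if (@current && $total + $sizes->[$i] > $limit) {
            push @batches, [@current];
            @current = ();
            $total   = 0;
        }
        push @current, $i;
        $total += $sizes->[$i];
    }
    push @batches, [@current] if @current;
    return \@batches;
}

# Split one archive into numbered pieces; returns the number of pieces written.
sub split_zip {
    my ($in_path, $out_dir, $base_name, $limit) = @_;
    require Archive::Zip;    # loaded lazily so plan_batches() works without it

    my $in = Archive::Zip->new();
    $in->read($in_path) == Archive::Zip::AZ_OK()
        or die "Error reading zip file: $in_path\n";

    my @members = $in->members();    # returned in archive order
    my $batches = plan_batches([ map { $_->compressedSize() } @members ], $limit);

    my $n = 0;
    for my $batch (@$batches) {
        my $out = Archive::Zip->new();
        $out->addMember($members[$_]) for @$batch;
        my $out_path = sprintf('%s/%s_%04d.zip', $out_dir, $base_name, ++$n);
        $out->writeToFileNamed($out_path) == Archive::Zip::AZ_OK()
            or die "Error writing zip file: $out_path\n";
    }
    return $n;
}

# Runs only when called as: perl split_zip.pl INPUT.zip OUTPUT_DIR BASE_NAME
split_zip(@ARGV, 50 * 1024 * 1024) if @ARGV == 3;
```

        The limit is compared against the sum of compressed member sizes only, so each output file will be slightly larger once zip headers are added; setting the limit a little under 50MB leaves room for that. The sprintf call also takes care of the zero-fill that the script above marks as "change later".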