bobross419 has asked for the wisdom of the Perl Monks concerning the following question:

Added spoilers so I don't break front page.

I'm working on a backup script for my department's triage list. Unfortunately, in order to pull the information I'm forced to use WWW::Mechanize::Firefox to go through the pages and pull down the information into .html files. Then I take all of the .html pages and put them into a .zip using Archive::Zip. Once this is done I'm using Mail::Outlook to send the .zip out to my group.

All seemed to be working fine for months until I suddenly received write errors when attempting to write to the .zip file. I started working on the code and uncovered a problem: the file contents aren't actually being put into the .zip, so the entries go in empty.

Here is the code I'm using with the email portion ripped out:

    use strict;
    use warnings;
    use WWW::Mechanize::Firefox;
    use Archive::Zip qw( :ERROR_CODES :CONSTANTS );

    my $input = "file.txt";
    open (my $FI, "<", $input) or die "Unable to open $input: $!\n";

    #The following will eventually be reworked as hash
    my @kbnums;
    my @schools;
    for (<$FI>) {
        m/(KB\d*)\s(.*)/i;
        push(@kbnums, $1);
        push(@schools, $2);
    }

    my $url = 'https://start.of.url/';
    my $mech = WWW::Mechanize::Firefox->new();
    my $zip = Archive::Zip->new();

    #$checkLoad was added to verify the file write was complete
    my $checkLoad = "some text that appears at bottom of page";
    my $count = 0;

    for (@kbnums) {
        my $file = "$schools[$count].html";
        my $addr = "$url$_";
        my $attempts = 0;
        do {
            $mech->get($addr, synchronize=> 0, ':content_file' => $file);
            $attempts++;
        } while (!(grep($checkLoad, $file)) && ($attempts < 10) );

        #Added this to verify the $file has the $checkLoad written
        grep($checkLoad, $file) ? print "$file - SUCCESS" : print "$file - FAIL";

        $zip->addFile($file);
        $count++;
    }

    my $status = $zip->overwriteAs('EOC-triage.zip');
    die "Unable to write Zip - $status : $!\n" if $status != AZ_OK;
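The comment in the script mentions reworking the two parallel arrays as a hash. A minimal sketch of that, assuming each input line looks like `KB12345 Some School`; the helper name `parse_kb_lines` and the sample lines are hypothetical. Unlike the array version, which pushes `$1` and `$2` without checking that the match succeeded, this skips lines that do not match. The pattern is also slightly stricter (`\d+`, anchored), which is an assumption about the input format.

```perl
use strict;
use warnings;

# Hypothetical helper: turn lines like "KB12345 Some School" into
# a hash of KB number => school name. Non-matching lines are skipped.
sub parse_kb_lines {
    my @lines = @_;
    my %school_for;
    for my $line (@lines) {
        chomp $line;
        if ( $line =~ m/^(KB\d+)\s+(.*)$/i ) {
            $school_for{$1} = $2;
        }
    }
    return %school_for;
}

# Sample data, invented for illustration only
my %school_for = parse_kb_lines(
    "KB1001 North High\n",
    "not a KB line\n",
    "KB1002 South Middle\n",
);

# Each KB number now maps directly to its output file name
for my $kb ( sort keys %school_for ) {
    print "$kb => $school_for{$kb}.html\n";
}
```

With a hash, the main loop no longer needs the separate `$count` index to keep the two arrays in step.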

When I run the above code, one of three things will happen:

1) All of the HTML files are written correctly, the Zip file is created containing all of the files, but all of the files inside the zip are 0 size containing no data.

2) All of the HTML files are written correctly, the Zip file is created containing all of the files with their correct data.

3) I receive the following error:

    IO error: reading data :
     at C:\strawberry\perl\vendor\lib/Archive/Zip/NewFileMember.pm line 60
        Archive::Zip::NewFileMember::_readRawChunk('Archive::Zip::NewFileMember=HASH(0x1af508c)', 'SCALAR(0x12db3a4)', 13787) called at C:\strawberry\perl\vendor\lib/Archive/Zip/Member.pm line 788
        Archive::Zip::Member::readChunk('Archive::Zip::NewFileMember=HASH(0x1af508c)', 32768) called at C:\strawberry\perl\vendor\lib/Archive/Zip/Member.pm line 1063
        Archive::Zip::Member::_writeData('Archive::Zip::NewFileMember=HASH(0x1af508c)', 'IO::File=GLOB(0x17d4484)') called at C:\strawberry\perl\vendor\lib/Archive/Zip/Member.pm line 1030
        Archive::Zip::Member::_writeToFileHandle('Archive::Zip::NewFileMember=HASH(0x1af508c)', 'IO::File=GLOB(0x17d4484)', 1, 181272) called at C:\strawberry\perl\vendor\lib/Archive/Zip/Archive.pm line 402
        Archive::Zip::Archive::writeToFileHandle('Archive::Zip::Archive=HASH(0x1afc8c4)', 'IO::File=GLOB(0x17d4484)') called at C:\strawberry\perl\vendor\lib/Archive/Zip/Archive.pm line 438
        Archive::Zip::Archive::overwriteAs('Archive::Zip::Archive=HASH(0x1afc8c4)', 'EOC-triage.zip') called at gatherTriage2.pl line 51
    Can't write to C:\DOCUME~1\user\LOCALS~1\Temp\prAo0rBZe7.zip at gatherTriage2.pl line 51
    Uncaught exception from user code:
        Unable to Write Zip - 4 : at gatherTriage2.pl line 52

From what I can determine, situation 1 occurs when I delete all of the old HTML and Zip files then try from scratch. Situations 2 and 3 seemingly occur at random whenever previously created HTML and Zip files are present.

Oddly, when I create a standalone script to go through all of the files using just Archive::Zip I don't seem to have this problem.

    use strict;
    use warnings;
    use Archive::Zip qw( :ERROR_CODES :CONSTANTS);

    my @files = <*>;
    my $arch = Archive::Zip->new();
    my $file = 'test.zip';

    $arch->addFile($_) for (@files);
    my $status = $arch->overwriteAs($file);

I tried duplicating this in the original script without success.

I have a feeling that there is something simple that I'm overlooking, but I've gone through the code and through the Archive::Zip CPAN page quite a few times now without success. My other idea is that the html files aren't being written completely, but the grep seems to indicate otherwise.
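One caveat about the check in the script: `grep($checkLoad, $file)` never reads the file. It evaluates the expression `$checkLoad` once for the one-element list `($file)`, and because `$checkLoad` is a non-empty string the result is always true, so the retry loop stops after the first attempt and SUCCESS is always printed. A minimal sketch of a check that actually opens the file, assuming the marker is a plain substring; `file_contains` and the file name `example.html` are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return 1 if $marker occurs anywhere in $file.
# Returns 0 if the file is missing, unreadable, or empty.
sub file_contains {
    my ( $file, $marker ) = @_;
    open( my $fh, '<', $file ) or return 0;
    local $/;                  # slurp mode: read the whole file at once
    my $content = <$fh>;
    close $fh;
    return 0 unless defined $content;
    return index( $content, $marker ) >= 0 ? 1 : 0;
}

# Usage with the variable names from the script
my $checkLoad = "some text that appears at bottom of page";
my $file      = "example.html";    # assumed name, for illustration only
print "$file - ", ( file_contains( $file, $checkLoad ) ? "SUCCESS" : "FAIL" ), "\n";
```

`index` is used rather than a regex so that any special characters in the marker text are matched literally.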

Any help is appreciated.

Re: WWW::Mechanize::Firefox and Archive::Zip Issue
by bobross419 (Acolyte) on Nov 07, 2011 at 02:45 UTC
    Apparently synchronize => 0 was causing the problems. I rewrote the script from scratch without it and everything works fine. I had originally added synchronize => 0 because nothing seemed to be happening... Of course something was happening, just very slowly.
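Since one symptom was zero-size entries in the archive, a cheap guard is to confirm that each file exists and is non-empty at the moment it is added. This is only a sketch and not the fix described in the reply, which was removing `synchronize => 0`. The helper `partition_by_size` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: split file names into those that exist with a
# non-zero size and those that are missing or empty.
sub partition_by_size {
    my @files = @_;
    my ( @ready, @not_ready );
    for my $file (@files) {
        if ( -f $file && -s _ ) {    # "_" reuses the stat from the -f test
            push @ready, $file;
        }
        else {
            push @not_ready, $file;
        }
    }
    return ( \@ready, \@not_ready );
}

my ( $ready, $not_ready ) = partition_by_size( glob('*.html') );
warn "Skipping empty or missing file: $_\n" for @$not_ready;

# In the real script each name in @$ready would be passed to $zip->addFile.
print "Ready to add: $_\n" for @$ready;
```

A size check only shows that something was written, not that the page finished loading, so it complements a content check rather than replacing one.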