bobross419 has asked for the wisdom of the Perl Monks concerning the following question:
Added spoilers so I don't break front page.
I'm working on a backup script for my department's triage list. Unfortunately, the only way I can get at the information is to use WWW::Mechanize::Firefox to walk through the pages and save each one as an .html file. I then put all of the .html files into a .zip using Archive::Zip, and finally use Mail::Outlook to send the .zip out to my group.
All seemed to be working fine for months until I suddenly started receiving write errors when writing the .zip file. While digging into the code I uncovered another problem: the files aren't actually being put into the .zip; they are going in empty.
Here is the code I'm using with the email portion ripped out:
use strict;
use warnings;
use WWW::Mechanize::Firefox;
use Archive::Zip qw( :ERROR_CODES :CONSTANTS );

my $input = "file.txt";
open (my $FI, "<", $input) or die "Unable to open $input: $!\n";

#The following will eventually be reworked as hash
my @kbnums;
my @schools;
for (<$FI>) {
    m/(KB\d*)\s(.*)/i;
    push(@kbnums, $1);
    push(@schools, $2);
}

my $url = 'https://start.of.url/';
my $mech = WWW::Mechanize::Firefox->new();
my $zip = Archive::Zip->new();

#$checkLoad was added to verify the file write was complete
my $checkLoad = "some text that appears at bottom of page";
my $count = 0;
for (@kbnums) {
    my $file = "$schools[$count].html";
    my $addr = "$url$_";
    my $attempts = 0;
    do {
        $mech->get($addr, synchronize=> 0, ':content_file' => $file);
        $attempts++;
    } while (!(grep($checkLoad, $file)) && ($attempts < 10) );

    #Added this to verify the $file has the $checkLoad written
    grep($checkLoad, $file) ? print "$file - SUCCESS" : print "$file - FAIL";
    $zip->addFile($file);
    $count++;
}

my $status = $zip->overwriteAs('EOC-triage.zip');
die "Unable to write Zip - $status : $!\n" if $status != AZ_OK;
When I run the above code, one of three things will happen:
1) All of the HTML files are written correctly, the Zip file is created containing all of the files, but all of the files inside the zip are 0 size containing no data.
2) All of the HTML files are written correctly, the Zip file is created containing all of the files with their correct data.
3) I receive the following error:
IO error: reading data : at C:\strawberry\perl\vendor\lib/Archive/Zip/NewFileMember.pm line 60
    Archive::Zip::NewFileMember::_readRawChunk('Archive::Zip::NewFileMember=HASH(0x1af508c)', 'SCALAR(0x12db3a4)', 13787) called at C:\strawberry\perl\vendor\lib/Archive/Zip/Member.pm line 788
    Archive::Zip::Member::readChunk('Archive::Zip::NewFileMember=HASH(0x1af508c)', 32768) called at C:\strawberry\perl\vendor\lib/Archive/Zip/Member.pm line 1063
    Archive::Zip::Member::_writeData('Archive::Zip::NewFileMember=HASH(0x1af508c)', 'IO::File=GLOB(0x17d4484)') called at C:\strawberry\perl\vendor\lib/Archive/Zip/Member.pm line 1030
    Archive::Zip::Member::_writeToFileHandle('Archive::Zip::NewFileMember=HASH(0x1af508c)', 'IO::File=GLOB(0x17d4484)', 1, 181272) called at C:\strawberry\perl\vendor\lib/Archive/Zip/Archive.pm line 402
    Archive::Zip::Archive::writeToFileHandle('Archive::Zip::Archive=HASH(0x1afc8c4)', 'IO::File=GLOB(0x17d4484)') called at C:\strawberry\perl\vendor\lib/Archive/Zip/Archive.pm line 438
    Archive::Zip::Archive::overwriteAs('Archive::Zip::Archive=HASH(0x1afc8c4)', 'EOC-triage.zip') called at gatherTriage2.pl line 51
Can't write to C:\DOCUME~1\user\LOCALS~1\Temp\prAo0rBZe7.zip at gatherTriage2.pl line 51
Uncaught exception from user code:
    Unable to Write Zip - 4 : at gatherTriage2.pl line 52
From what I can determine, situation 1 occurs when I delete all of the old HTML and Zip files then try from scratch. Situations 2 and 3 seemingly occur at random whenever previously created HTML and Zip files are present.
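The trace shows the read failing inside overwriteAs, which suggests the file contents are read when the archive is written, not when addFile is called. If Firefox is still saving the pages at that point (plausible with synchronize => 0, though that is only an assumption), waiting for each file to settle before zipping might help. A minimal sketch using only core modules; the helper name wait_for_stable_file, its polling defaults, and the "size unchanged between two polls" heuristic are hypothetical and not part of the original script:

```perl
use strict;
use warnings;
use Time::HiRes qw(sleep);

# Hypothetical helper: poll until $file has a non-zero size that is unchanged
# between two consecutive checks, or give up after $max_polls polls.
# Returns 1 if the file looks settled, 0 otherwise.
sub wait_for_stable_file {
    my ($file, $max_polls, $interval) = @_;
    $max_polls = 20  unless defined $max_polls;
    $interval  = 0.5 unless defined $interval;

    my $last = -1;
    for (1 .. $max_polls) {
        my $size = (-s $file) || 0;    # missing or empty file counts as 0
        return 1 if $size > 0 && $size == $last;
        $last = $size;
        sleep $interval;               # fractional sleep via Time::HiRes
    }
    return 0;
}
```

In the loop this could be called just before $zip->addFile($file), for example `wait_for_stable_file($file) or warn "$file may be incomplete\n";`. A stable size is only a heuristic and does not prove the writer has closed the file.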
Oddly, when I create a standalone script to go through all of the files using just Archive::Zip I don't seem to have this problem.
use strict;
use warnings;
use Archive::Zip qw( :ERROR_CODES :CONSTANTS);

my @files = <*>;
my $arch = Archive::Zip->new();
my $file = 'test.zip';
$arch->addFile($_) for (@files);
my $status = $arch->overwriteAs($file);
I tried duplicating this in the original script without success.
I have a feeling that there is something simple that I'm overlooking, but I've gone through the code and through the Archive::Zip CPAN page quite a few times now without success. My other idea is that the html files aren't being written completely, but the grep seems to indicate otherwise.
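One thing worth noting about that check: grep($checkLoad, $file) never opens the file. Perl's grep EXPR, LIST evaluates $checkLoad once for each element of the one-item list ($file), and since $checkLoad is a non-empty string the result is always 1. So the do/while loop exits after the first attempt and "SUCCESS" is printed regardless of what is on disk. A minimal sketch of a check that reads the file instead; the helper name file_contains is hypothetical and not part of the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 only if $file can be opened and its
# contents include the literal text $marker, otherwise 0.
sub file_contains {
    my ($file, $marker) = @_;
    open(my $fh, '<', $file) or return 0;   # missing/unreadable: treat as not loaded
    local $/;                               # slurp mode: read the whole file at once
    my $content = <$fh>;
    close $fh;
    return 0 unless defined $content;
    return index($content, $marker) >= 0 ? 1 : 0;   # literal match, no regex
}
```

With this in place, the loop condition would become `!file_contains($file, $checkLoad) && $attempts < 10`. index is used rather than a regex so that punctuation in the marker text is matched literally.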
Any help is appreciated.
Re: WWW::Mechanize::Firefox and Archive::Zip Issue
by bobross419 (Acolyte) on Nov 07, 2011 at 02:45 UTC