in reply to Correct Perl settings for sending zipfile to browser

Personally, I would go about this differently: Instead of writing a file to disk, zipping it, and then re-reading it for download, you can do it all on the fly*. (If you did want to do it via temporary files like in your current script, please read my nodes on File::Temp examples and running external programs, as there are potential security and concurrency issues with your current script.)

IO::Compress::Zip is a core module, and you can use it to generate and output a ZIP file on the fly (see its docs for details):

use warnings;
use strict;
use IO::Compress::Zip qw/$ZipError/;

my @lines = (qw/ Hello World Foo Bar /);
my $eol = "\r\n";

binmode STDOUT; # just to play it safe
my $z = IO::Compress::Zip->new('-', # STDOUT
    Name => "Filename.txt" )
    or die "zip failed: $ZipError\n";
for my $line (@lines) {
    $z->print($line, $eol);
}
$z->close();

It's also possible to write the ZIP file to a scalar, e.g. if you need to know its length before writing it out, although that of course increases the memory usage. At the very least, you don't need to buffer the output lines like you're doing in your current script with @resp.
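As a rough illustration of the scalar variant: passing a scalar reference instead of '-' makes IO::Compress::Zip write the archive into memory. This is only a sketch; the helper name zip_lines_to_scalar and the sample data are made up for the example.

```perl
use warnings;
use strict;
use IO::Compress::Zip qw/$ZipError/;

# Hypothetical helper: zips the given lines into a single-member
# archive held in memory and returns the raw ZIP bytes.
sub zip_lines_to_scalar {
    my ($member, @lines) = @_;
    my $zipdata = '';
    # A scalar reference as the output target means "write to this buffer".
    my $z = IO::Compress::Zip->new(\$zipdata, Name => $member)
        or die "zip failed: $ZipError\n";
    $z->print($_) for @lines;
    $z->close();
    return $zipdata;
}

my $eol = "\r\n";
my $zip = zip_lines_to_scalar("Filename.txt",
    map { $_ . $eol } qw/ Hello World Foo Bar /);

# The whole archive is in memory, so its size is known before any output.
my $size = length $zip;
print STDERR "archive is $size bytes\n";
```

The trade-off is the one mentioned above: the complete archive sits in memory until it is written out.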

* OTOH, I agree with cavac that if these files are going to be unchanged across multiple downloads, it'd certainly be more efficient to not re-generate them on every request and use appropriate HTTP caching methods instead.

Update: Minor edits.

Re^2: Correct Perl settings for sending zipfile to browser
by Anonymous Monk on Nov 15, 2019 at 04:18 UTC

    Implementing your example with my code, where @lines is changed to @resp, and where I remove the $CRLF from the lines going into the array, results in the following two errors in the log file, and the browser gives an Internal Server Error.

    ...AH01215: Wide character in IO::Compress::Zip::write...

    ...malformed header from script '___.pl': Bad header: PK\x03\x04\x14,...

    In this case, the file will very likely change every time it is downloaded, as the database is regularly updated. I experimented earlier on multiple downloads during a time when I knew that no one was logged in to the database to make changes, but ordinarily change may be expected. That means, if I could get a direct download like you suggest to work, it would be a perfect solution in this case.

      the browser gives an Internal Server Error

      The code I showed isn't a complete CGI example, since it doesn't output the headers, so those would need to be added back in. Since in the original code those are being written by hand, I'd suggest at least upgrading to one of the CGI modules, such as e.g. CGI::Simple, to generate those for you.
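To illustrate why the log shows "Bad header: PK\x03\x04": the server expects a header block, terminated by an empty line, before the first byte of the archive. The following is a minimal sketch that writes the headers by hand so it does not depend on any non-core module; a CGI module's header method would normally generate the equivalent. The helper names zip_headers and send_zip, and the file names, are made up for the example.

```perl
use warnings;
use strict;
use IO::Compress::Zip qw/$ZipError/;

# Hypothetical helper: builds the response header block for a ZIP download.
sub zip_headers {
    my ($zipname) = @_;
    my $crlf = "\r\n";
    return join $crlf,
        "Content-Type: application/zip",
        qq{Content-Disposition: attachment; filename="$zipname"},
        "", "";    # the empty line terminates the header block
}

# Hypothetical helper: headers first, then the ZIP stream on STDOUT.
sub send_zip {
    my ($zipname, $member, @lines) = @_;
    binmode STDOUT;
    print STDOUT zip_headers($zipname);
    my $z = IO::Compress::Zip->new('-', Name => $member)
        or die "zip failed: $ZipError\n";
    $z->print($_, "\r\n") for @lines;
    $z->close();
}

# CGI servers set GATEWAY_INTERFACE, so the response is only
# emitted when the script actually runs under a web server.
send_zip("export.zip", "Filename.txt", qw/ Hello World Foo Bar /)
    if $ENV{GATEWAY_INTERFACE};
```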

      Wide character in IO::Compress::Zip::write

      That would mean that there's Unicode in your @lines. (Although I don't see an encoding being set on TARGET in the original code, so I think it would have the same issue?) Anyway, although IO::Compress::Zip provides a filehandle-like interface, it looks like it doesn't (yet?) support encoding layers. A manual encoding with Encode does work though:

      use warnings;
      use strict;
      use IO::Compress::Zip qw/$ZipError/;
      use Encode qw/encode/;

      my @lines = (qw/ Hello World Foo Bar /, "\N{U+1F42A}");
      my $eol = "\r\n";
      my $encoding = "UTF-8"; # or maybe "CP1252" for Windows

      binmode STDOUT; # just to play it safe
      my $z = IO::Compress::Zip->new('-', # STDOUT
          Name => "Filename.txt" )
          or die "zip failed: $ZipError\n";
      for my $line (@lines) {
          $z->print( encode($encoding, $line.$eol,
              Encode::FB_CROAK|Encode::LEAVE_SRC) );
      }
      $z->close();

      Note: For encodings such as UTF-16, it seems encode adds a Byte Order Mark for every string it encodes, and I don't see an option in the module to disable that. One way to get rid of them is to remove them manually, but an alternative might be to replace the for loop with this, at the expense of higher memory usage:

      $z->print( encode($encoding, join('', map {$_.$eol} @lines), Encode::FB_CROAK|Encode::LEAVE_SRC) );

      Or just stick to UTF-8, as that's pretty ubiquitous.
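Another possibility, not part of the code above: Encode's endianness-specific names "UTF-16LE" and "UTF-16BE" do not emit a BOM, so one can write a single BOM up front and then encode line by line. A small sketch of that idea, with the result collected in a string ($out) purely for demonstration; in the ZIP example each chunk would go to $z->print instead.

```perl
use warnings;
use strict;
use Encode qw/encode/;

my @lines = (qw/ Hello World /, "\N{U+1F42A}");
my $eol   = "\r\n";

# "UTF-16" prepends a BOM on every call; "UTF-16LE" never does.
my $with_bom = encode("UTF-16",   "A");
my $no_bom   = encode("UTF-16LE", "A");

# Emit one little-endian BOM (U+FEFF) first, then BOM-free chunks.
my $out = encode("UTF-16LE", "\x{FEFF}");
for my $line (@lines) {
    $out .= encode("UTF-16LE", $line.$eol,
        Encode::FB_CROAK|Encode::LEAVE_SRC);
}
```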

      Update:

      I remove the $CRLF from the lines

      You can leave that in and remove my $eol, as they're the same thing (I missed that on my first read of the original source, sorry).

        Thank you, thank you!

        I'm now making some progress. I could live with it as-is, I think, but would like, if possible, one more improvement: The Content-Length. I suppose, however, that is not possible when compressing on the fly.

        I was already using Encode, so there was little more to adjust. I have now downloaded a zipped file successfully, which can be opened normally, with the following code.

        sub exportdatabase {
        fork: {
            my ($recnum,$revnum,$book,$chap,$verse,$text) = '';
            my @resp = ();
            my $timestamp = "$curdate_$curtime";
            $timestamp =~ s/[\/:.]/-/g;
            my $to_windows = '';
            my $CRLF = "\n";
            if ($OS eq "Windows") {
                $to_windows = '--to-crlf'; # SAME AS -l
                $CRLF = "\r\n";
            }
            my $zipfile = "$db_export_file.zip";
            my $encoding = "UTF-8";

            $statement = qq|
                SELECT a.RecordNum, a.RevisionNum, a.Book, a.Chapter, a.Verse, a.Text
                from $table a
                INNER JOIN (SELECT RecordNum, max(RevisionNum) RevisionNum
                            FROM $table GROUP BY RecordNum) b
                USING (RecordNum,RevisionNum);
            |;
            &connectdb('exportdatabase');

            push @resp, "RECORD#\tREVISION#\tBOOK#\tCHAP#\tVERSE#\tTEXT, AS EDITED BY: $curdate $curtime (Pacific Time)$CRLF";
            while (($recnum,$revnum,$book,$chap,$verse,$text) = $quest->fetchrow_array()) {
                push @resp, "$recnum\t$revnum\t$book\t$chap\t$verse\t$text$CRLF";
            }

            binmode STDOUT; # just to play it safe
            print qq|Content-Type: application/zip, application/octet-stream$CRLF|;
            print qq|Cache-Control: no-cache, no-store, must-revalidate$CRLF|;
            print qq|Accept-Ranges: bytes$CRLF|;
            print qq|Content-Language: utf8$CRLF|;
            #print qq|Content-Length: | . (stat $zipfile)[7] . "$CRLF";
            print qq|Content-Disposition: attachment; filename="$zipfile";$CRLF$CRLF|;

            my $z = IO::Compress::Zip->new('-', # STDOUT
                Name => "$db_export_file" )
                or die "zip failed: $ZipError\n";
            for my $line (@resp) {
                $z->print( encode($encoding, $line,
                    Encode::FB_CROAK|Encode::LEAVE_SRC) );
            }
            $z->close();
        } #END fork
        } # END SUB exportdatabase

        Without the Content-Length header, the client does not know how large the file being downloaded is, nor how long it will take. But, at least the file arrives intact!
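For what it's worth, Content-Length does become possible if the archive is built in a scalar first, as suggested earlier in the thread, at the cost of holding the whole ZIP in memory. A minimal sketch under that assumption; the helper name zip_response, the file names, and the sample lines are made up, and the real code would pass in the lines fetched from the database.

```perl
use warnings;
use strict;
use IO::Compress::Zip qw/$ZipError/;
use Encode qw/encode/;

# Hypothetical helper: returns the complete response (headers plus ZIP body).
sub zip_response {
    my ($zipname, $member, @lines) = @_;

    # Build the archive in memory so its size is known up front.
    my $zipdata = '';
    my $z = IO::Compress::Zip->new(\$zipdata, Name => $member)
        or die "zip failed: $ZipError\n";
    for my $line (@lines) {
        $z->print( encode("UTF-8", $line,
            Encode::FB_CROAK|Encode::LEAVE_SRC) );
    }
    $z->close();

    my $crlf = "\r\n";
    my $headers = join $crlf,
        "Content-Type: application/zip",
        "Content-Length: " . length($zipdata),
        qq{Content-Disposition: attachment; filename="$zipname"},
        "", "";    # the empty line terminates the header block
    return $headers . $zipdata;
}

my $response = zip_response("export.zip", "export.txt",
    "Hello\r\n", "World\r\n");

# Only emit the response when running under a web server (CGI).
if ($ENV{GATEWAY_INTERFACE}) {
    binmode STDOUT;
    print STDOUT $response;
}
```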