Hrunting has asked for the wisdom of the Perl Monks concerning the following question:

I have a project I'm working on that takes several files, tars them up, and sends them to a client via HTTP. Right now, I create the files, use Archive::Tar to write the tar file to a temporary file (created using IO::File->new_tmpfile()), and then print the temporary file to the browser. That works, and I don't really have much of a problem with it, but what I'd really like to do is cut out the middle-man temporary file and just have the tar data written directly to STDOUT.
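In rough outline, the current code looks something like this (a sketch, not my exact script; the file names and content type are placeholders, and I'm assuming write() accepts the IO::File object as a filehandle):

use strict;
use Archive::Tar;
use IO::File;

my @files = ('file/foo.c', 'file/bar.c');    # hypothetical file names

my $tar = Archive::Tar->new();
$tar->add_files(@files);

my $tmp = IO::File->new_tmpfile() or die "new_tmpfile failed: $!";
$tar->write($tmp);                           # spool the archive to the temp file

seek($tmp, 0, 0) or die "seek failed: $!";
binmode $tmp;
binmode STDOUT;
print "Content-Type: application/x-tar\r\n\r\n";

my $buf;
print $buf while read($tmp, $buf, 8192);     # copy the temp file to the browser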

First off, I don't even know if that will work, since I don't really understand how Archive::Tar itself works. When I try to pass \*STDOUT to Archive::Tar->create_archive() (which takes the name of a file or a reference to a filehandle/glob, according to the documentation), I get:

Can't call method "gzwrite" without a package or object reference at /usr/local/lib/perl5/site_perl/5.6.0/Archive/Tar.pm line 521.
Line 521 in Archive::Tar is:
$file->gzwrite ("\0" x 1024)
And that filehandle is the result of (from Archive::Tar):
$fh = Compress::Zlib::gzdopen_ ($fh, $mode, 0)
The thing is, nothing seems weird. The filehandle should be opened on the fileno for STDOUT and things should progress along smoothly, with all the data being written to STDOUT (in my case, the browser). Is there something in the web server or Zlib which prevents all this from taking place, or can anyone think of a better way to do what I'm doing? I'm really at a loss as to where to begin working towards a solution.
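For reference, the failing call is essentially this (the compression level is my guess):

use Archive::Tar;

# This is the line that dies with the gzwrite error above:
Archive::Tar->create_archive(\*STDOUT, 9, @files);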

Thanks.

Re: Tar File To Web Browser
by repson (Chaplain) on Dec 10, 2000 at 14:17 UTC
    In my ActivePerl docs for Perl 5.6.0, Archive::Tar says this:

    write('file.tar',$compressed)
    Will write the in-memory archive to disk. If no filename is given, returns the entire formatted archive as a string, which should be useful if you'd like to stuff the archive into a socket or a pipe to gzip or something. If the second argument is true, the module will try to write the file compressed.

    So simply use the multistage set of commands like it says in the docs with a slight change:

    use Archive::Tar;

    $tar = Archive::Tar->new();
    $tar->read("origin.tar.gz", 1);  # you don't seem to want this line though
    $tar->add_files("file/foo.c", "file/bar.c");
    $tar->add_data("file/baz.c", "This is the file contents");
    # $tar->write("files.tar");
    print STDOUT $tar->write();      # make sure HTTP headers are already printed before running this line
    Note: I have not tested this, but if the docs are correct this should work without mucking around with references to STDOUT. If it still doesn't work, it's probably a problem with your copy of the module.
      The problem with this is that unless you pass a filename or a filehandle to write(), it doesn't actually return tar file data (at least, that's what I'm seeing in Archive::Tar's code); what comes back is just the formatted listings, not the actual data. The routine that actually writes the tar file is only called if you pass a reference to a filehandle or a filename. When I try passing in a reference to STDOUT, I get the same error as before (which makes sense, since they call the same internal routines).
Re (tilly) 1: Tar File To Web Browser
by tilly (Archbishop) on Dec 10, 2000 at 23:20 UTC
    If you have tar installed locally, then another solution may be a pipeline. The exact options depend on the version of tar. For instance, to produce a gzipped tar file using GNU tar, try the following:
    my $cmd = "tar -T - -cvzf -";
    open (TAR, "| $cmd") or die "Cannot run '$cmd': $!";
    print TAR map "$_\n", @files;
    close TAR;
    One nice thing about this approach is that it is incremental: data starts streaming immediately, without having to store the whole archive in memory.
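    In a CGI context that might look something like this (a sketch; the content type and file names are illustrative):

    my @files = ('file/foo.c', 'file/bar.c');   # hypothetical

    print "Content-Type: application/x-gzip\r\n\r\n";
    binmode STDOUT;                      # the child tar inherits this stdout

    my $cmd = "tar -T - -cvzf -";        # drop -v to keep the listing out of the error log
    open (TAR, "| $cmd") or die "Cannot run '$cmd': $!";
    print TAR map "$_\n", @files;        # feed the file names on tar's stdin
    close TAR or die "'$cmd' failed: $? $!";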
The Problem & My Solution (please comment)
by Hrunting (Pilgrim) on Dec 11, 2000 at 00:49 UTC
    The problem lies in the _get_handle() routine in Archive::Tar.
    sub _get_handle {
        my ($fh, $flags, $mode);
        sysseek ($_[0], 0, 0)
            or goto &_drat;
    If the filehandle is STDOUT (already opened by the web server), that sysseek fails, and _get_handle returns nothing, leading to the errors above. Run from a normal command line (thanks Fastolfe), STDOUT is opened fresh and the sysseek succeeds (which is why a lot of command-line tests doing the exact same thing worked).

    Removing the goto (?!) appears to let sysseek fail silently. If it can get to the beginning, it will, no harm, no foul. If it can't (as in the environment I'm working in), it just continues on and prints the data successfully. Does that solution seem okay? What are the pitfalls of going with it?
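    In code, the change I'm describing amounts to this (a sketch of my local edit, not an official patch):

    sub _get_handle {
        my ($fh, $flags, $mode);

        # Rewind if we can; a web server's STDOUT is a pipe/socket and
        # can't seek, so let a failed sysseek pass silently.
        sysseek ($_[0], 0, 0);          # was: ... or goto &_drat;

        # ... rest of the routine unchanged ...
    }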

      I think a more correct solution (and this isn't something you or I really need to worry about, but rather the module's author) might be to test $fh to see if it is pointing at STDOUT, and if so, skip the seek entirely. I can't think of any cases off the top of my head where a failed seek would be a bad thing here, so it might be perfectly OK to simply disregard the return value and let it fail silently if it needs to.
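      Something like this, say (a hypothetical sketch, not the module author's code):

      # Skip the rewind when the handle is STDOUT itself:
      unless (fileno($_[0]) == fileno(STDOUT)) {
          sysseek ($_[0], 0, 0)
              or goto &_drat;
      }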
        Are there any cases where a failed seek would be bad?
      I ran into this a while back, so the details are fuzzy. merlyn had an article in WebTechniques (I hope; I found it in Programming w/ Perl) in which he talked about doing something like this, and the exercise for the reader was to improve it to use compression. I tried, and ran up against the above problem. The author of Archive::Tar was quite helpful, but I don't recall (sorry) the resolution (use a typeglob?); you could probably contact him for a fix. I believe I did something like the above: hid the seek (ahem) and lived with it.

      Update: I found the author's reply (here for the heck of it):
      --- quote
      From: Stephen Zander <gibreel@pobox.com>
      To: Andy Bach <root@wiwb.uscourts.gov>
      Subject: Re: Archive::Tar

      >> "Andy" == Andy Bach root@wiwb.uscourts.gov> writes:
      We just send back an HTTP header saying a tar.gzip file is to follow and dump the output to STDOUT, and the browser handles the saving, but I couldn't get $tar->write() to use STDOUT easily with compression on.

      Try using ->write(\*STDOUT) to force the output to the pre-existing STDOUT handle. This should Just Work. If it doesn't, I'd appreciate the output of any error logs from Apache and information on your environment (e.g., Apache version, Perl version, Archive::Tar version, whether you're using mod_perl, etc.)

      Thanks
      Stephen

      "Farcical aquatic ceremonies are no basis for a system of government!"
      --- end quote
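      In code, Stephen's suggestion comes down to something like this (a sketch; the true second argument asks for compression, per the docs quoted above):

      my $tar = Archive::Tar->new();
      $tar->add_files(@files);

      print "Content-Type: application/x-gzip\r\n\r\n";
      binmode STDOUT;
      $tar->write(\*STDOUT, 1);   # gzipped archive straight to STDOUT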

      a