Spida has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to add some error-handling code to "Update local FreeDB copy".
This code downloads a foo.tar.bz2 and extracts it into a certain $directory.
$directory has subdirectories with about 700000 files in them; foo.tar.bz2 adds and updates about 15000 of them.
How do I check whether
- foo.tar.bz2 has been downloaded completely
- foo.tar.bz2 has been untarred completely
Can I check whether a foobar.tar.bz2 has been untarred completely without downloading the entire file?

edited: Fri Nov 1 15:14:57 2002 by jeffa - linkafied the node

Replies are listed 'Best First'.
Re: Has foo.tar.bz2 been extracted to $directory?
by bluto (Curate) on Nov 01, 2002 at 17:12 UTC
    - foo.tar.bz2 has been downloaded completely

    The '-t' or '--test' flag of bzip2 checks the integrity of the compressed file. If an MD5 checksum is published, you can generate one locally and compare the two as well. I'm assuming GNU tar, which you may be using(?), performs the same integrity check when run with the 'j' flag, though you may want to test this.
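
For anyone who wants the same two checks from inside a script rather than from the shell, here is a minimal Python sketch using only the standard library. The function names `bz2_is_intact` and `md5_hex` are made up for illustration, and the first one is only an approximation of what `bzip2 -t` does: it decompresses the file to the end and discards the output.

```python
import bz2
import hashlib


def bz2_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole .bz2 file decompresses without error."""
    try:
        with bz2.open(path, "rb") as stream:
            # Read and discard the data; a truncated or corrupt file
            # raises EOFError or OSError before the end is reached.
            while stream.read(chunk_size):
                pass
    except (OSError, EOFError):
        return False
    return True


def md5_hex(path, chunk_size=1 << 20):
    """Return the MD5 digest of a file as a hex string."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A partially downloaded file fails the first check because the bzip2 end-of-stream marker is missing. The digest from `md5_hex` is only useful if the download site publishes a checksum to compare against.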

    - foo.tar.bz2 has been untarred completely

    Some tars remain silent and appear to untar files on filesystems that have run out of space (and create zero-length files to boot). GNU tar has a '--compare' option, which I've never used but which might be worth a try. If you aren't using GNU tar, you'll probably need to examine the contents of the archive yourself (e.g. tar -tv).
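
One way to "examine the contents yourself" is to walk the archive listing and check each regular file against the target directory. The Python sketch below is only an illustration of that idea; `missing_or_mismatched` is a hypothetical name, and comparing file sizes is an assumption about what counts as "extracted completely" (it catches missing and zero-length files, not files with the right size but wrong content). It still has to read and decompress the whole archive, but it writes nothing to disk.

```python
import os
import tarfile


def missing_or_mismatched(archive_path, directory):
    """List (name, reason) pairs for archive files not properly on disk."""
    problems = []
    # "r|bz2" reads the archive as a forward-only stream, so no member
    # is extracted and no temporary copy is created.
    with tarfile.open(archive_path, mode="r|bz2") as archive:
        for member in archive:
            if not member.isfile():
                continue  # skip directories, links and other entries
            target = os.path.join(directory, member.name)
            if not os.path.isfile(target):
                problems.append((member.name, "missing"))
            elif os.path.getsize(target) != member.size:
                problems.append((member.name, "size differs"))
    return problems
```

An empty result means every regular file in the archive exists in the directory with the size recorded in the archive.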

    bluto

      I don't see a --compare in my manpage, but there is a -W / --verify.

      Makeshifts last the longest.

      The idea of using "--compare" is not practical for me, since that would decompress and untar the whole archive again, compare it, and delete the temporary extracted copy. Given the size of the archive and the resources available, that is not possible.
      OTOH bzip2 -t foo.tar.bz2 works fast and fine *g*.
Re: Has foo.tar.bz2 been extracted to $directory?
by waswas-fng (Curate) on Nov 01, 2002 at 20:33 UTC
    If you look at Archive::Tar and Compress::Bzip2, you can use both to test for completeness of your two required actions. C::Bzip2 returns undef on error (decompress), and Archive::Tar reports errors through its return value as well.

    -Waswas
      That would help me while my script is executing, but not after the script has run, when I don't know whether it completed.
        Welp, you really have a few options: see if the app writing the file has a lock on it, or check whether the size of the file is still changing over x number of seconds. The C::Bzip2 module failing on decompress will tell you that either A: the file is corrupt, or B: the file is not fully downloaded. You could also move the update of the file (Net::FTP etc.) into your script, so you KNOW when the transfer is complete or whether it is partial. IMHO this is the best route, so you don't "spin cycles" checking whether the file is complete, new or such.
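
The "is the size still changing" option could look roughly like the Python sketch below. The name `size_is_stable` and its parameters are invented for illustration. Treat it as a heuristic only: a stalled download looks exactly like a finished one, so it is best combined with an integrity check on the file itself.

```python
import os
import time


def size_is_stable(path, quiet_seconds=5.0, polls=5):
    """Return True if the file size stays the same for quiet_seconds."""
    interval = quiet_seconds / polls
    last_size = os.path.getsize(path)
    for _ in range(polls):
        time.sleep(interval)
        size = os.path.getsize(path)
        if size != last_size:
            return False  # something is still writing to the file
        last_size = size
    return True
```

Polling several times within the window, instead of comparing only the first and last size, notices a change sooner and returns as soon as one is seen.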

        -Waswas