in reply to maddening system call gzip/gunzip problems

Your errors fall into two categories.

1. Gunzip skips files that don't have a recognized extension such as .gz, so running gunzip on the file 'cns061206' (for example) doesn't work. If you want to uncompress something that doesn't have a .gz extension, try zcat cns061206 > newfilename instead.

2. You are attempting to compress files that already have a .gz extension, so gzip won't try to compress them again. Without more details, I would guess that either you didn't unzip these files in the first part of the program, or that some external system kept creating new .gz files in the directory while you were processing.
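
If shelling out to zcat is inconvenient, the same "decompress by content, not by name" idea can be sketched in Perl with IO::Uncompress::Gunzip. This is only an illustration: the scratch directory, the sample text, and the output name cns061206.txt are made up for the example, and only cns061206 comes from the question.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Scratch directory so the sketch does not touch real data (assumption).
my $dir = tempdir( CLEANUP => 1 );

# Build a gzip-compressed file whose name lacks the .gz suffix.
my $text       = "line one\nline two\n";
my $compressed = "$dir/cns061206";
gzip( \$text => $compressed )
    or die "gzip failed: $GzipError\n";

# Decompress by content rather than by name; the input file stays in place.
my $plain = "$dir/cns061206.txt";    # hypothetical output name
gunzip( $compressed => $plain )
    or die "gunzip failed: $GunzipError\n";

# Read the result back to confirm the round trip.
open( my $fh, '<', $plain ) or die "Cannot open $plain: $!\n";
my $restored = do { local $/; <$fh> };
close($fh);
print $restored;
```

Unlike the gunzip command, this leaves the compressed original where it was, so the script decides explicitly when (or whether) to remove it.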

Depending on what kind of processing you are doing, you might want to consider a better approach than 'unzip everything', 'process everything', 'zip everything'. For example, if you are not actually changing the contents, you can just run 'zcat' on the file and read the output, using something like open( IN, "zcat $file |" ) (or, better yet, something like Compress::Zlib). If you are modifying the files, you might consider processing them one at a time, rather than a directory at a time, which could avoid problems with new files being created while your program is running.
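
As a rough sketch of the Compress::Zlib route, the snippet below reads a compressed file line by line without ever writing an uncompressed copy to disk. The file name sample.gz and its contents are invented so the example can run on its own; in the real script the gzopen(..., 'rb') half would point at an existing file.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Compress::Zlib;    # exports gzopen and $gzerrno by default

# Scratch directory and file name are assumptions for the example.
my $dir  = tempdir( CLEANUP => 1 );
my $file = "$dir/sample.gz";

# Write a small compressed file so there is something to read back.
my $out = gzopen( $file, 'wb' ) or die "Cannot write $file: $gzerrno\n";
$out->gzwrite("alpha\nbeta\ngamma\n");
$out->gzclose();

# Read it line by line; the file on disk stays compressed throughout.
my $in = gzopen( $file, 'rb' ) or die "Cannot read $file: $gzerrno\n";
my @lines;
my $line;
while ( $in->gzreadline($line) > 0 ) {
    chomp $line;
    push @lines, $line;
}
$in->gzclose();

print scalar(@lines), " lines read\n";
```

Because nothing is uncompressed or recompressed on disk, there is no second pass in which a file could be in the wrong state.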


We're not surrounded, we're in a target-rich environment!

Re^2: maddening system call gzip/gunzip problems
by neilwatson (Priest) on Dec 07, 2006 at 17:01 UTC
    Although gunzip reports that the file has no .gz extension, it had one before the script was run: that file was a gzip file before the script started. Similarly, the files are all unzipped before the gzip part is run. It seems that the script attempts to gunzip or gzip some files twice. I would love to rewrite all of the processing code for this script. Alas, time constraints do not permit (the rest of the code is very poor, undocumented, and not mine).

    Neil Watson
    watson-wilson.ca

      You are using readdir() to iterate over the files in the directory while adding files to that directory inside your loop. Remember that g(un)zip creates a new file to receive the output of the (de)compression, and readdir() may return that new file later in the same pass. This is why files appear to be processed twice.
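
One way around the readdir() problem described above is to take a snapshot of the directory before changing anything in it. The sketch below is a guess at the shape of such a loop, not the original script: the file names are invented, and it uses IO::Compress::Gzip in place of system("gzip ...") only so that it runs on its own. A system call could sit in the same spot.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use IO::Compress::Gzip qw(gzip $GzipError);

# Scratch directory with three plain files standing in for the real data.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(a.txt b.txt c.txt)) {
    open( my $fh, '>', "$dir/$name" ) or die "Cannot create $name: $!\n";
    print {$fh} "contents of $name\n";
    close($fh);
}

# Read the whole listing first, then close the handle before modifying
# the directory.
opendir( my $dh, $dir ) or die "Cannot open $dir: $!\n";
my @todo = sort grep { -f "$dir/$_" && !/\.gz\z/ } readdir($dh);
closedir($dh);

# Files created from here on are not in @todo, so none is visited twice.
my @done;
for my $name (@todo) {
    gzip( "$dir/$name" => "$dir/$name.gz" )
        or die "gzip failed for $name: $GzipError\n";
    unlink("$dir/$name") or die "Cannot remove $name: $!\n";
    push @done, $name;
}

print "compressed: @done\n";
```

The grep on the file name is a second line of defense: even if a stray .gz file is already present, it is left out of the work list.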


      The intelligent reader will judge for himself. Without examining the facts fully and fairly, there is no way of knowing whether vox populi is really vox dei, or merely vox asinorum. — Cyrus H. Gordon
        I know that system() forks, but the parent waits. How can the script continue before the forked gzip finishes cleaning up?

        Neil Watson
        watson-wilson.ca