mdog has asked for the wisdom of the Perl Monks concerning the following question:

Brethren --

Is there a way in Perl (natively) to read or grep the contents of a compressed text file without first decompressing it into a plain text file on disk and then reading that?

Basically, in perl, can you take the uncompressing stream of data and pass it into grep?

Background: I say natively because I am on Windows, and although I do have gunzip.exe and grep.exe (and Perl's grep), pipes don't work.

I need to leave the file compressed when I am done so if I didn't have to recompress the file, that would save time (thus the stream of data reference). Of course, I could copy the compressed file, uncompress that, and delete both when done but that seems half-assed.

The compressed file is a gzipped text file (the -9 compression is great) but I could store it as .zip or anything else that will shrink it down significantly.

Many thanks,
mdog

Replies are listed 'Best First'.
Re: Getting Text From A Compressed File
by Zaxo (Archbishop) on Apr 01, 2005 at 01:34 UTC

    If you have PerlIO enabled (default in 5.8+), you can use PerlIO::gzip:

    use PerlIO::gzip;
    open my $fh, '<:gzip', '/path/to/file.gz' or die $!;
    while (<$fh>) {
        # do stuff
    }
    close $fh or die $!;

    Update: Typo fixed, thanks, Roy Johnson++.

    After Compline,
    Zaxo
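
If installing PerlIO::gzip is not an option, much the same thing should be possible with IO::Uncompress::Gunzip, which ships with Perl 5.10 and later. What follows is only a minimal sketch: `grep_gz`, the temporary sample file, and the sample text are hypothetical names and data made up for illustration, and the pattern is borrowed from elsewhere in this thread. It reads the gzipped file line by line and never writes a decompressed copy to disk, so the original stays compressed.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use IO::Compress::Gzip qw(gzip $GzipError);
use IO::Uncompress::Gunzip qw($GunzipError);

# Hypothetical helper: return the lines of a gzipped text file that
# match $pattern, decompressing on the fly. The .gz file is only read.
sub grep_gz {
    my ($file, $pattern) = @_;
    my $z = IO::Uncompress::Gunzip->new($file)
        or die "Cannot open $file: $GunzipError\n";
    my @hits;
    while (defined(my $line = $z->getline)) {
        push @hits, $line if $line =~ $pattern;
    }
    $z->close;
    return @hits;
}

# Build a small sample .gz file so the sketch runs on its own
# (assumption: in real use you would pass your existing file instead).
my ($tmp_fh, $sample) = tempfile(SUFFIX => '.gz', UNLINK => 1);
close $tmp_fh;
my $text = "alpha\nmy test pattern\nbeta\n";
gzip \$text => $sample
    or die "gzip failed: $GzipError\n";

print grep_gz($sample, qr/^my.*pattern$/);
```

In real use you would drop the sample-file section and call `grep_gz` on the existing file with whatever regex you need.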

Re: Getting Text From A Compressed File
by moot (Chaplain) on Apr 01, 2005 at 01:01 UTC
Re: Getting Text From A Compressed File
by polettix (Vicar) on Apr 01, 2005 at 16:55 UTC
    Just out of curiosity (I'm a Linux user): what do you mean by "pipes don't work"? Isn't it possible to do something like:

    open CLEARDATA, "gunzip -c $filename |" or die "Cannot run gunzip: $!";
    while (<CLEARDATA>) {
        print if /^my.*pattern$/;
    }
    close CLEARDATA;

    in Windows? I thought there was some kind of work-around for this.

    TIA,

    Flavio (perl -e "print(scalar(reverse('ti.xittelop@oivalf')))")

    Don't fool yourself.
      You are absolutely correct...that does work!

      When I said "pipes don't work", I meant that you can't pipe the output of one program directly into the input of another using the normal *NIX way of "|".

      I actually did not think about messing with the filehandles the way you did so many thanks (and ++). Learned another great thing here on perlmonks!