in reply to Parsing a large file 80GB .gz file and making an output file with specifice columns in this original file.

My approach is usually to parallelize the decompression and the string handling by using the two-argument form of open:

my $filename= "some.file.gz";
my $cmd= "gunzip -cd '$filename' |";
open my $fh, $cmd
    or die "Couldn't decompress '$filename' via [$cmd]: $! / $?";

This piped approach also works nicely with transfers over ssh connections.
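
To tie the pipe back to the original question, here is a minimal sketch of reading from the gunzip pipe and writing only selected columns. The helper name `extract_columns`, the tab separator, and the column choice (0 and 2) are assumptions for illustration, not details from the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: copy the chosen 0-based columns of each
# tab-separated input line to the output handle.
sub extract_columns {
    my ($in, $out, @cols) = @_;
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /\t/, $line, -1;   # -1 keeps trailing empty fields
        # Columns missing from a short line are written as empty strings.
        print {$out} join("\t", map { $_ // '' } @fields[@cols]), "\n";
    }
}

# Only runs when an input file is given, for example:
#   perl extract.pl some.file.gz > columns.tsv
if (@ARGV) {
    my $filename = shift @ARGV;
    my $cmd = "gunzip -cd '$filename' |";
    open my $fh, $cmd
        or die "Couldn't decompress '$filename' via [$cmd]: $! / $?";
    extract_columns($fh, \*STDOUT, 0, 2);     # columns 0 and 2 are an arbitrary choice
    close $fh
        or die "gunzip failed for '$filename': $! / $?";
}
```

Because gunzip runs as a separate process, decompression and the split/join work can proceed in parallel, and the script never holds more than one line in memory.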


Re^2: Parsing a large file 80GB .gz file and making an output file with specifice columns in this original file.
by salva (Canon) on Jul 16, 2013 at 11:47 UTC
    Why would you want to use the two-argument form when you can use the five-argument form?
    open my $fh, '-|', 'gunzip', '-cd', $filename or die "Couldn't decompress '$filename' via gunzip: $! / $?";

      I use the two-argument form because the list form of pipe open fails for me on Windows, unfortunately:

      >perl -we "my $fn= shift; open fh, '|-', 'gunzip', '-cd', $fh or warn +$!" foo.gz
      List form of pipe open not implemented at -e line 1.
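
One possible way to reconcile the two replies is to choose the form of open at run time. The sketch below is an assumption, not something either poster wrote: the helper names `gunzip_open_args` and `open_gunzip` are hypothetical, and the Windows branch assumes a `gunzip` executable on the PATH. It uses the list form where that is available and falls back to a single command string on Windows Perls that report the error above.

```perl
use strict;
use warnings;

# Hypothetical helper: build the arguments that follow the filehandle
# in open(). The optional second parameter overrides $^O, which makes
# the choice easy to check without switching machines.
sub gunzip_open_args {
    my ($filename, $os) = @_;
    $os = $^O unless defined $os;
    if ($os eq 'MSWin32') {
        # Three-argument form with one command string (run via the shell).
        # Windows filenames cannot contain double quotes, so plain
        # double-quoting is assumed to be enough here.
        return ('-|', qq{gunzip -cd "$filename"});
    }
    # List form: no shell involved, so no quoting of $filename is needed.
    return ('-|', 'gunzip', '-cd', $filename);
}

# Hypothetical wrapper: returns a read handle on the decompressed data.
sub open_gunzip {
    my ($filename) = @_;
    my ($mode, @cmd) = gunzip_open_args($filename);
    open my $fh, $mode, @cmd
        or die "Couldn't decompress '$filename' via [@cmd]: $! / $?";
    return $fh;
}
```

The list form avoids shell quoting problems with unusual filenames, which is the usual argument for preferring it wherever it works.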