in reply to Unzip file to scalar, treat scalar as a file

I am a bit surprised that 2.1 GB of unzipped data should exhaust the resources of a 64-bit machine with 64 GB of memory. My overall gut feeling is that, even with the overhead involved, it should fit.

Having said that, one solution might be to ask the operating system to do the unzipping and redirect the output to your Perl program, which then reads its input line by line. I do this quite regularly in a somewhat similar (though different) situation, with input files of at least the same order of magnitude (and sometimes significantly larger): a ksh or bash script (my main program) performs some initial operations on the input file (for example sorting the data) and pipes the output to my Perl program, which reads line by line and applies all the further transformations needed. I have found this to be a fairly efficient method, both in terms of memory usage and of data throughput.
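Just to illustrate the idea, here is a minimal sketch of reading a decompressor's output line by line from Perl. The archive name is made up, and I am assuming an unzip utility that can write a member's contents to STDOUT (e.g. unzip -p); adjust to whatever tool you actually use:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical archive name; replace with your actual file.
    my $archive = 'big_data.zip';

    # Let the OS do the unzipping; Perl reads the decompressed
    # stream line by line, so the whole 2.1 GB never sits in memory.
    open my $fh, '-|', 'unzip', '-p', $archive
        or die "Cannot open pipe from unzip: $!";

    while ( my $line = <$fh> ) {
        chomp $line;
        # ... process one line at a time here ...
    }

    close $fh or warn "unzip pipe closed with status $?";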

In some cases, if I remember correctly, I even piped together, under ksh, an unzipping operation, a Unix sort and a Perl program, and I do not remember exceeding the platform's quotas or limits (well, I sometimes did, but that was due to wrong ulimit parameters and similar configuration warts; with a proper system configuration it worked, in my experience). So I would think this method could apply equally well to your case, although the conditions may obviously differ; one requirement is that the decompressing utility be able to send its output to STDOUT rather than to a physical file (I am not sure which ones can and which cannot).
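As a rough sketch of that kind of chained pipeline (file name and sort key are made up), you can even let the shell chain the decompression and the sort from within the Perl program itself:

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Hypothetical archive and sort options; adjust to your data.
    # The shell runs 'unzip -p ... | sort', Perl reads the sorted
    # result line by line from the pipe.
    open my $fh, '-|', 'unzip -p big_data.zip | sort -t, -k1,1'
        or die "Cannot start unzip/sort pipeline: $!";

    while ( my $line = <$fh> ) {
        # ... line-by-line transformations go here ...
    }

    close $fh or warn "pipeline exited with status $?";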

Hope this helps.
