I am a bit surprised that 2.1 GB of unzipped data should exhaust the resources of a 64-bit machine with 64 GB of memory. My overall gut feeling is that, even with the overhead involved, it should fit.

Having said that, one solution might be to let the operating system do the unzipping and redirect the output to your Perl program, which then reads its input line by line. I do this quite regularly in a somewhat similar (though different) situation, with input files of at least the same order of magnitude (and sometimes significantly larger): a ksh or bash script (my main program) performs some initial operations on the input file (for example, sorting the data) and pipes the output to my Perl program, which reads it line by line and does all the further transformations needed. I have found this to be a pretty efficient method, both in terms of memory usage and data throughput.

In some cases, if I remember correctly, I even chained an unzipping operation, a Unix sort and a Perl program in a single ksh pipeline, and I don't remember having exceeded the platform's quotas or limits (well, I sometimes did, but that had to do with wrong ulimit parameters and similar configuration warts; with the proper system configuration, it worked in my experience). So I would think that this method could apply equally well to your case, although conditions may differ. One requirement is that the decompressing utility be able to send its output to STDOUT rather than to a physical file (I am not sure which ones can or can't do that).

Hope this helps.


In reply to Re: Unzip file to scalar, treat scalar as a file by Laurent_R
in thread Unzip file to scalar, treat scalar as a file by bwilli27
