in reply to Iterating through HUGE FILES

The part of the code you show here is good.

Is your perl compiled with large (that is >2Gb) file support?

You can tell by doing

> perl -V:uselargefiles
uselargefiles='define';
If the output isn't uselargefiles='define'; then you need to get or compile another perl binary. uselargefiles is an option that has to be set when the perl interpreter is compiled, though the default in recent perls (I believe since 5.8.0) is to turn it on.
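The same check can be made from inside a script through the core Config module, which exposes the interpreter's build-time settings. This is only a minimal sketch; the helper name has_largefile_support is made up for illustration.

```perl
use strict;
use warnings;
use Config;    # core module exposing the interpreter's build-time settings

# Hypothetical helper: true if this perl was built with large file support.
sub has_largefile_support {
    return defined $Config{uselargefiles}
        && $Config{uselargefiles} eq 'define';
}

my $message = has_largefile_support()
    ? "large file support: yes\n"
    : "large file support: no, files over 2Gb may fail\n";
print $message;
```

A script could call this at startup and die early with a clear message instead of failing partway through a big file.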

Re^2: Iterating through HUGE FILES
by dynamo (Chaplain) on May 10, 2005 at 21:17 UTC
    So that WAS the problem. I knew there was some compiled-in limit. Thank you for mentioning this.

    If this is the problem (which seems likely) with the original poster's code, again I suggest using other utilities to break up the data set into manageable chunks, and then processing those chunks in perl.

      Other utilities, like say: cat HUGE | perl my_script.pl

      ...This is, of course, bait for Merlyn to jump all over :-)

      Seriously though, can you simply read from STDIN ? Then your Perl shouldn't care how big the file is.
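      For what it's worth, a minimal sketch of the STDIN approach might look like the following. The helper name count_stream and the line/byte counting are only placeholders for whatever the real per-line work is; the point is that the script reads a stream one line at a time and never opens the big file itself.

```perl
use strict;
use warnings;

# Hypothetical helper: count lines and bytes from any filehandle,
# holding only one line in memory at a time.
sub count_stream {
    my ($fh) = @_;
    my ( $lines, $bytes ) = ( 0, 0 );
    while ( my $line = <$fh> ) {
        $lines++;
        $bytes += length $line;
    }
    return ( $lines, $bytes );
}

# Only consume STDIN when something was piped or redirected into it.
unless ( -t STDIN ) {
    my ( $lines, $bytes ) = count_stream( \*STDIN );
    print "$lines lines, $bytes bytes\n";
}
```

      It would be run the same way as above: cat HUGE | perl my_script.pl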

Re^2: Iterating through HUGE FILES
by jmaya (Acolyte) on Feb 19, 2006 at 02:46 UTC
    It is ActiveState's perl. I think it was not compiled with that parameter. Thank you.

      Which version of AS Perl? It must be pretty ancient, as the last 7 or 8 versions (at least) have been built with large file support. On win32 anyway. It's easy to forget that they also produce binaries for other OSs.

      If you cannot upgrade for any reason, then I second the idea of using a system utility to read the file and pipe it into your script. I'd probably do it using the 'piped open'. If you need to re-write the data, send it to stdout and redirect the output via the command line.

      die "You didn't redirect the output" if -t STDOUT;
      ## Backslashes are doubled so "\t" and "\b" in the path aren't read as escapes
      open BIGFILE, "cmd /c type \\path\\to\\bigfile |" or die $!;
      while( <BIGFILE> ) {
          ## do stuff
      }
      close BIGFILE;
      __END__
      script bigfile.dat > modified.dat

      Dying if STDOUT hasn't been re-directed is a touch that you'll appreciate after the first time you print a huge binary file to the console by accident. The bells! The bells! :)


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.