Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

I am working on a Perl script in which I am going to store a large amount of data in a hash variable. I don't know what the maximum number of elements a hash variable can store is. I am afraid that an overflow could cause loss of data in the hash.

Kindly send your suggestions.

Thank you.


Re: what is the maximum elements storage in a Hash or an array variable
by davido (Cardinal) on Jan 28, 2006 at 09:21 UTC

    Perl does not impose arbitrary limits on the size of its data structures. You could pull a 10-gigabyte file into a single scalar variable if your operating system supported files that large, and your available memory and swap space could handle it.

    But just because you can read the entire phone book from the state of California into memory at once doesn't mean you should. And just because Perl would let you doesn't mean your system can handle it. There won't be an "overflow", but there could be a lot of hard drive churning as the swap file fills, lots of sluggish system behavior, and eventually some sort of error message letting you know you have completely saturated your system's resources.

    If your data structure could grow beyond your system's capability for handling it efficiently, you'll have to look for other solutions, such as holding only a portion of the data in memory at a given time, or using a database.
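    For example, a minimal sketch that holds only a portion of the data in memory at a time, reading a huge file record by record instead of slurping it whole (the file name is illustrative):

        open my $fh, '<', 'huge_file.txt' or die "Cannot open: $!";
        while ( my $line = <$fh> ) {
            chomp $line;
            # process one record here; only the current line is in memory
        }
        close $fh;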


    Dave

Re: what is the maximum elements storage in a Hash or an array variable
by salva (Canon) on Jan 28, 2006 at 16:38 UTC
    That limit will be imposed by the RAM available on your machine... though if that is not enough, you can always change your script to use an on-disk tree with DB_File or a similar module.
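    For instance, a minimal sketch of tying a hash to a B-tree file on disk with DB_File (the file name is illustrative):

        use DB_File;
        use Fcntl;

        my %hash;
        # keys and values are stored in data.btree on disk, not in RAM
        tie %hash, 'DB_File', 'data.btree', O_RDWR|O_CREAT, 0666, $DB_BTREE
            or die "Cannot tie hash: $!";

        $hash{some_key} = 'some value';

        untie %hash;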
Re: what is the maximum elements storage in a Hash or an array variable
by TorontoJim (Beadle) on Oct 10, 2017 at 18:40 UTC

    I know this is a really old post ... but ... I just ran a script that created 20 million 192-byte tokens, wrote them to a file, and added them to a hash as keys. I'm happy to say there were no duplicates, and my Strawberry Perl had no problem handling a hash with 20 million keys, although it sure did slow down my desktop computer while it was running.
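    A rough sketch of that kind of run, with make_token() standing in for whatever actually generated the tokens:

        my %seen;
        open my $out, '>', 'tokens.txt' or die "Cannot open: $!";
        for ( 1 .. 20_000_000 ) {
            my $token = make_token();    # stand-in for the real generator
            warn "duplicate: $token\n" if $seen{$token}++;
            print $out "$token\n";
        }
        close $out;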

    Perl rawks ... !

    Unfortunately, the resulting 3.8 GB file was too big for any of my text editors to open, so I rewrote the script to create multiple smaller files. If only Microsoft stuff had been created by Larry Wall ... just sayin'.

      You may be interested in DBM::Deep, which can map an in-memory hash to disk. It can also write files much larger than 4 GB on 64-bit systems (see Large File Support).

      It's slower than a pure in-memory hash, but the on-disk hash is persistent, and it may well perform better than an in-memory hash that pushes the system into swapping.
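      A minimal sketch of what using it looks like (the file name is illustrative):

          use DBM::Deep;

          # the hash is backed by tokens.db on disk, so it persists between runs
          my $db = DBM::Deep->new( 'tokens.db' );

          $db->{'some_token'} = 1;
          print "stored\n" if exists $db->{'some_token'};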

      (I'm not associated with DBM::Deep, but I used it extensively some years ago, and it saved a lot of time and trouble. So I always try to give back and suggest it where appropriate.)

      -QM
      --
      Quantum Mechanics: The dreams stuff is made of