Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello, my Perl script, which populates a hash, dumps core (it dies with an 'Illegal Instruction(coredump)' message) once the hash grows past 700,000 elements. Is there a limit on the size of a hash? If not, could you please tell me why it is failing? Thanks a lot.

Replies are listed 'Best First'.
Re: Hashes -- Core dump
by nothingmuch (Priest) on Oct 22, 2002 at 21:24 UTC
    There is a limit: your memory, virtual and physical. 700,000 elements is a lot. You should probably use a DBM module such as DB_File to store the hash on disk and read it from there, rather than keeping the whole thing in memory.

    Because of the way hashes are implemented, they usually take up a lot of space, probably more than the raw data for your 700,000 elements would need, I guess. A hash runs each key through a hash function and uses the result as an index into a table. If the keys end up spread far apart in that table, a lot of space is left unused but still allocated.
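To make the point about unused slots concrete, here is a toy sketch. It is not Perl's real hash implementation: the hash function (a sum of character codes) and the names toy_bucket_index and bucket_usage are made up purely for illustration. It only shows how a table that is allocated up front can end up with slots that no key lands in.

```perl
use strict;
use warnings;

# Toy hash function (NOT the one Perl uses): sum of the character
# codes of the key, reduced modulo the number of table slots.
sub toy_bucket_index {
    my ($key, $bucket_count) = @_;
    my $sum = 0;
    $sum += ord($_) for split //, $key;
    return $sum % $bucket_count;
}

# Count how many slots receive at least one key and how many stay empty.
sub bucket_usage {
    my ($bucket_count, @keys) = @_;
    my @buckets = (0) x $bucket_count;    # every slot is allocated up front
    $buckets[ toy_bucket_index($_, $bucket_count) ]++ for @keys;
    my $used = grep { $_ > 0 } @buckets;  # grep in scalar context counts matches
    return ($used, $bucket_count - $used);
}

my ($used, $empty) = bucket_usage(16, qw(apple pear fig kiwi plum));
print "used slots: $used, empty but allocated: $empty\n";
```

With five keys and sixteen slots, at least eleven slots stay empty however the keys fall, yet all sixteen are allocated.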

    Perl's built-in tie function lets you access data through ordinary hashes, arrays or scalars while the storage behind them works differently, so that you can, for example, go beyond your memory limits. Examples can be found in the documentation of the above-mentioned module, and there is more reference material in AnyDBM_File.
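A minimal sketch of what tying a hash to disk looks like, assuming DB_File is installed. The file name demo.db, the temporary directory and the sample keys are placeholders invented for this example, not anything from the original script.

```perl
use strict;
use warnings;
use DB_File;                      # also makes the O_* flags available
use File::Temp qw(tempdir);

# Hypothetical scratch location, so the sketch cleans up after itself.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/demo.db";

# The hash looks ordinary, but every store goes to the file on disk.
tie my %h, 'DB_File', $file, O_RDWR|O_CREAT, 0640, $DB_HASH
    or die "Cannot tie $file: $!";
$h{"id$_"} = "record $_" for 1 .. 1000;
untie %h;

# Tie the same file again: the data is still there.
tie my %again, 'DB_File', $file, O_RDWR, 0640, $DB_HASH
    or die "Cannot reopen $file: $!";
my $value = $again{id42};
my $count = keys %again;          # walks the file; fine for a small demo
untie %again;

print "id42 => $value ($count records on disk)\n";
```

The only line that differs from normal hash code is the tie call; the rest of the script keeps using %h as before.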

    -nuffin
    zz zZ Z Z #!perl
      Hi nuffin, thanks much for the almost instant reply. I used NDBM_File with tie and tried the following snippet of code. It still dumps core when it reaches the foreach loop (which sorts the hash and prints it to a file). If I take the foreach loop out, it works fine. I need your help again: I want to sort the hash in ascending order of numeric key and print it to a file. Thank you!
      use Fcntl;        # provides O_RDWR and O_CREAT
      use NDBM_File;
      tie(%h, 'NDBM_File', 'test_tie.tmp', O_RDWR|O_CREAT, 0640);
      while (<INPUT>) {
          $id = substr($_, 9, 11);
          if (! exists($h{$id}) ) {
              $h{$id} = $_;
          } else {
              .... ....
          }
      }
      foreach (sort keys %h) {
          print OUTPUT $h{$_};
      }
      untie %h;
        When you sort keys, you load an array of all the keys into memory and sort it, which again can take up a lot of memory. I have just the solution for you: let DB_File keep the keys in numeric order in a BTREE, so nothing has to be sorted in memory.
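        use strict;
        use warnings;
        use DB_File;

        # BTREE whose keys are kept in ascending numeric order
        my $num_order_btree = DB_File::BTREEINFO->new;
        $num_order_btree->{compare} = sub { $_[0] <=> $_[1] };
        tie my %h, 'DB_File', 'test_tie.tmp', O_RDWR|O_CREAT, 0640, $num_order_btree
            or die "Cannot tie test_tie.tmp: $!";

        while (<INPUT>) {
            my $id = substr($_, 9, 11);
            unless (exists($h{$id})) {
                $h{$id} = $_;
            } else {
                # id already stored: your existing handling goes here
            }
        }

        # each walks the BTREE in key order, so no in-memory sort is needed
        while (defined($_ = each %h)) {
            print OUTPUT $h{$_};
        }
        untie %h;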
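To check the ordering without touching any input file, here is a small self-contained sketch of the same technique. It assumes DB_File is installed and uses an in-memory database (undef in place of the file name). The four sample keys and the @order array are invented for the demonstration.

```perl
use strict;
use warnings;
use DB_File;

# Same idea as above: a BTREE with a numeric comparison function.
my $numeric_btree = DB_File::BTREEINFO->new;
$numeric_btree->{compare} = sub { $_[0] <=> $_[1] };

# undef as the file name asks DB_File for an in-memory database.
tie my %h, 'DB_File', undef, O_RDWR|O_CREAT, 0640, $numeric_btree
    or die "Cannot create in-memory BTREE: $!";

# Insert the keys deliberately out of order.
$h{$_} = "line for $_\n" for (10, 9, 100, 2);

# Collect the keys in the order the BTREE hands them back.
my @order;
while (my ($key, $value) = each %h) {
    push @order, $key;
}
untie %h;

print "@order\n";
```

Without the compare function a BTREE orders its keys as strings, which would place 10 and 100 before 2 and 9. The numeric comparison is what makes plain iteration come out in the order the question asks for.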


        -nuffin
        zz zZ Z Z #!perl