in reply to Hashes -- Core dump

There is a limit: your memory, virtual and physical. 700,000 elements is a lot. You should probably use a dbm module such as DB_File to keep the hash on disk and read from it there, rather than holding the whole thing in memory.

Because of the way hashes are implemented, they usually take up a lot of space, probably more than the raw data for your 700,000 elements would need, I guess. A hash runs each key through a hashing function and ends up with an index into a table, where the entry is stored. If the keys end up distributed pretty far from one another, a lot of that table is left unused but still allocated.
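To make the key-to-index idea concrete, here is a toy sketch. It is not perl's real hash implementation: toy_hash, the table size of 16 and the sample keys are all made up for illustration.

```perl
use strict;
use warnings;

# Toy hash function (NOT perl's real one): fold the character codes into a number.
sub toy_hash {
    my ($key) = @_;
    my $h = 0;
    $h = ($h * 33 + ord($_)) % 65_536 for split //, $key;
    return $h;
}

my $table_size = 16;                       # slots allocated up front
my @table      = (undef) x $table_size;

for my $key (qw(apple pear plum fig)) {
    my $index = toy_hash($key) % $table_size;   # key -> slot index
    push @{ $table[$index] }, $key;             # colliding keys share a slot
}

# Count the slots that actually hold something.
my $used = grep { defined } @table;
printf "%d of %d slots used, %d allocated but empty\n",
    $used, $table_size, $table_size - $used;
```

With only four keys in sixteen slots, most of the table stays empty, which is the unused-but-allocated space described above.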

Perl's builtin tie function lets you access data through ordinary hashes, arrays or scalars while the internals behind them differ, so that you can, for example, get past your memory limits. Examples can be found in the documentation of the module mentioned above, and there is more reference material in AnyDBM_File.
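A minimal sketch of tying a hash to a file on disk. I am assuming SDBM_File here only because it is bundled with every perl; DB_File takes the same arguments in its tie call. The file name demo_tie and the sample record are made up.

```perl
use strict;
use warnings;
use Fcntl;                        # O_RDWR, O_CREAT
use SDBM_File;                    # stand-in for DB_File; bundled with perl
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1); # scratch directory, removed on exit
my $file = "$dir/demo_tie";       # hypothetical file name

# From here on %h looks like a normal hash, but its data lives on disk.
tie my %h, 'SDBM_File', $file, O_RDWR|O_CREAT, 0640
    or die "Cannot tie $file: $!";
$h{42} = "record for id 42\n";
untie %h;

# Tie again: the record comes back from the file, not from memory.
tie my %again, 'SDBM_File', $file, O_RDWR|O_CREAT, 0640
    or die "Cannot re-tie $file: $!";
my $record = $again{42};
untie %again;

print $record;
```

SDBM_File limits key plus value to roughly 1 KB per record, so for real data DB_File is the better choice.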

-nuffin
zz zZ Z Z #!perl

Replies are listed 'Best First'.
Re: Re: Hashes -- Core dump
by Anonymous Monk on Oct 23, 2002 at 15:05 UTC
    Hi nuffin, thanks much for an almost instant reply. I used NDBM_File and tie and tried the following snippet of code. It still dumps core when it reaches the foreach loop (which sorts the hash and prints it to a file). If I take the foreach loop out, it works fine. I need your help again... I want to sort the hash in ascending order of numeric key and print it to a file. Thank you!!!
    use Fcntl;        # for O_RDWR and O_CREAT
    use NDBM_File;

    tie(%h, 'NDBM_File', 'test_tie.tmp', O_RDWR|O_CREAT, 0640);

    while (<INPUT>) {
        $id = substr($_, 9, 11);
        if (! exists($h{$id}) ) {
            $h{$id} = $_;
        }
        else {
            ....
            ....
        }
    }

    # this is the loop that dumps core
    foreach (sort keys %h) {
        print OUTPUT $h{$_};
    }

    untie %h;
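      When you call sort keys %h, the whole list of keys is loaded into memory and sorted there, which can again take up a lot of space. I have the most perfect solution for you: a DB_File BTREE keeps its keys ordered on disk, so with a numeric compare function each walks them in ascending order and no sort is needed.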
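      use strict;
      use warnings;
      use DB_File;

      # INPUT and OUTPUT are assumed to be opened already, as in your script.

      # BTREE that compares keys numerically, so they are stored in ascending numeric order
      my $num_order_btree = DB_File::BTREEINFO->new;
      $num_order_btree->{compare} = sub { $_[0] <=> $_[1] };

      tie my %h, 'DB_File', 'test_tie.tmp', O_RDWR|O_CREAT, 0640, $num_order_btree
          or die "Cannot tie test_tie.tmp: $!";

      while (<INPUT>) {
          my $id = substr($_, 9, 11);
          unless (exists($h{$id})) {
              $h{$id} = $_;
          }
          else {
              # id already seen (your original handling goes here)
          }
      }

      # each returns the keys in BTREE order, so nothing is sorted in memory
      while (defined(my $key = each %h)) {
          print OUTPUT $h{$key};
      }

      untie %h;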


      -nuffin
      zz zZ Z Z #!perl