in reply to Re^4: Fast(er) serialization in Perl
in thread Fast(er) serialization in Perl

Is 'NM_001005525' different from 'NM_1005525'?

If not, you have a serious bug...

If yes:

90% of your keys have 6 digits, so putting this data into an array with 1 million entries seems reasonable. That comes to about 2 MB of memory if you can limit your values to 65536 distinct numbers, i.e. 2 bytes per entry (it's a counter, isn't it?)
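
A minimal sketch of how the 2 MB figure could be reached, under assumptions that are mine and not from your post: the keys look like 'NM_' plus exactly six digits, and no counter ever exceeds 65535. A plain Perl array of a million scalars would take considerably more than 2 MB, so this uses one packed string and vec() with 16-bit slots. The sub names (slot_of, increment, count_of) are made up for the example.

```perl
use strict;
use warnings;

# Assumption: keys look like 'NM_123456' and counts stay below 65536.
my $SLOTS  = 1_000_000;              # one slot per possible 6-digit ID
my $counts = "\0" x (2 * $SLOTS);    # 2 bytes per slot, i.e. 2,000,000 bytes

# Turn 'NM_000042' into the array index 42.
sub slot_of {
    my ($key) = @_;
    my ($digits) = $key =~ /^NM_(\d{6})$/
        or die "unexpected key format: $key";
    return $digits + 0;
}

# Add one to the 16-bit counter for this key, refusing to wrap around.
sub increment {
    my ($key) = @_;
    my $i = slot_of($key);
    my $n = vec($counts, $i, 16);
    die "counter overflow for $key" if $n == 65535;
    vec($counts, $i, 16) = $n + 1;
    return $n + 1;
}

# Read the counter for this key (0 if it was never incremented).
sub count_of {
    my ($key) = @_;
    return vec($counts, slot_of($key), 16);
}

increment('NM_000042') for 1 .. 3;
increment('NM_999999');
print count_of('NM_000042'), "\n";   # prints 3
print length($counts), "\n";         # prints 2000000
```

Since $counts is just a byte string, it can be written to and read back from a file in binmode as is, which may also help with the serialization time.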

The other keys have 9 digits, so it seems you are coding your genes in groups of 3 digits. All of them start with "001".

So generally - from what you show - a hash of arrays seems reasonable where the hash key represents the first 3 digits and the array the rest.
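
Roughly what I mean, as a sketch only. The splitting rule is my assumption: the last six digits index the array, and whatever digits come before them (possibly none) become the hash key. That way 'NM_001005525', 'NM_1005525' and 'NM_005525' all stay distinct. The sub names are made up for the example.

```perl
use strict;
use warnings;

my %count;   # prefix => [ counters indexed by the last six digits ]

# Split 'NM_001005525' into ('001', 5525); 'NM_123456' gives ('', 123456).
sub split_key {
    my ($key) = @_;
    my ($prefix, $rest) = $key =~ /^NM_(\d*?)(\d{6})$/
        or die "unexpected key format: $key";
    return ($prefix, $rest + 0);
}

# Add one to the counter for this key, creating the inner array on demand.
sub increment {
    my ($prefix, $i) = split_key($_[0]);
    return ++$count{$prefix}[$i];
}

# Read a counter without autovivifying a new inner array.
sub count_of {
    my ($prefix, $i) = split_key($_[0]);
    return 0 unless exists $count{$prefix};
    return $count{$prefix}[$i] // 0;
}

increment('NM_001005525') for 1 .. 2;
increment('NM_005525');
print count_of('NM_001005525'), "\n";   # prints 2
print count_of('NM_005525'), "\n";      # prints 1
print count_of('NM_1005525'), "\n";     # prints 0
```

Each inner array only grows up to the highest index used for its prefix, so a handful of 9-digit keys does not cost you a second full million-entry array.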

Re^6: Fast(er) serialization in Perl
by mrguy123 (Hermit) on Apr 11, 2010 at 15:16 UTC
    OK, I understand what you are saying.
    I will try this direction, and hope I manage to save some time.
    Thanks for your help
      BTW: do you use this hash data read-only? If not, you should take care of simultaneous calls of your script. And if you're only accessing a relatively "small" number of entries, better choose one of the flat-file DB solutions mentioned above.
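
A sketch of one common way to guard against simultaneous calls, in case the data is not read-only: take an exclusive flock() on a separate lock file around every read-modify-write. The file name counts.dat, the one-number file format and the subs with_lock and bump are hypothetical, just for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical data file in a throwaway directory, for the demo only.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/counts.dat";

# Run $code while holding an exclusive lock on "$path.lock".
sub with_lock {
    my ($path, $code) = @_;
    open my $lock, '>>', "$path.lock" or die "cannot open $path.lock: $!";
    flock($lock, LOCK_EX) or die "cannot lock $path.lock: $!";
    my $result = $code->();
    close $lock;                      # closing the handle releases the lock
    return $result;
}

# Read the stored number, add one, write it back, all under the lock.
sub bump {
    my ($path) = @_;
    return with_lock($path, sub {
        my $n = 0;
        if (open my $in, '<', $path) {
            $n = <$in> // 0;
            chomp $n;
            close $in;
        }
        $n++;
        open my $out, '>', $path or die "cannot write $path: $!";
        print {$out} "$n\n";
        close $out or die "cannot close $path: $!";
        return $n;
    });
}

bump($file) for 1 .. 3;
print bump($file), "\n";   # prints 4
```

flock() is advisory, so this only helps if every copy of the script goes through the same lock file.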