Re: Speeding a disk based hash

In many situations a BTree can be much faster than a hash for disk-based data, and BerkeleyDB supports BTrees as well as hashes. Also, if you're exceeding available memory by just a bit, you can tie a hash to an in-memory database. BerkeleyDB should be more memory-efficient than Perl, so the data might fit in RAM when Perl's native data structures do not.
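Here is a minimal sketch of the idea, assuming the DB_File module (the Berkeley DB interface bundled with most Perl builds) is available. Passing undef as the filename asks for an in-memory database, and $DB_BTREE selects the BTree format instead of a hash. The sample keys are made up for illustration; whether this actually saves memory for your data is something to measure, not assume.

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;    # exports $DB_BTREE and $DB_HASH

# An undef filename requests an in-memory database rather than a file on disk.
my %h;
tie %h, 'DB_File', undef, O_RDWR|O_CREAT, 0666, $DB_BTREE
    or die "Cannot create in-memory BTree: $!";

# Illustrative data only: store each key with its length as the value.
$h{$_} = length $_ for qw(pear apple fig);

# A BTree stores keys in sorted order (lexical by default),
# so keys() walks them in that order.
my @keys = keys %h;
print "@keys\n";

untie %h;
```

To keep the data on disk instead, replace undef with a filename; the rest of the tie call stays the same.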

Beyond that, at Re: size on disk of tied hashes I explained some of the performance problems of working from disk with large datasets, and briefly discussed some of the available options. Note that if you care to benchmark your application, you do not want to benchmark it with random data. Use a sample of real data. Disk performance is strongly affected by your access pattern, and real-world access patterns are not very random (else caching would not be a good idea).
