Build your own index to the data.
Use the MD5 of the key (128 bits, binary) plus the file position (64 bits): 24 bytes per record times roughly 500 million records gives an index file of about 11 GB.
Sort by the MD5.
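
A minimal sketch of building that index, assuming a hypothetical key<TAB>value text data file (data.txt here) and that the entry list fits in memory for the sort; at the full 500 million records you'd sort externally (e.g. the system sort on fixed-length records) instead:

    use strict;
    use warnings;
    use Digest::MD5 qw(md5);

    my ($datafile, $indexfile) = ('data.txt', 'data.idx');  # hypothetical names

    open my $data, '<', $datafile or die "open $datafile: $!";
    my @entries;
    while (1) {
        my $offset = tell $data;            # byte position of this record
        my $line   = <$data>;
        last unless defined $line;
        my ($key)  = split /\t/, $line, 2;  # assumes key<TAB>value records
        # 16-byte binary MD5 + 8-byte big-endian offset = 24-byte record
        # ('Q>' needs a 64-bit perl)
        push @entries, md5($key) . pack('Q>', $offset);
    }
    close $data;

    # Sort by the MD5: a plain string sort orders the binary keys correctly,
    # because the MD5 is the leading 16 bytes of each fixed-length record.
    open my $idx, '>:raw', $indexfile or die "open $indexfile: $!";
    print {$idx} $_ for sort @entries;
    close $idx;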
With fixed-length records, writing a binary chop to locate a record's offset is relatively easy and gives you O(log n) access time.
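
And a sketch of the binary chop itself, seeking directly to record boundaries in the sorted index; lookup_offset and the file names are illustrative, not a fixed API:

    use strict;
    use warnings;
    use Digest::MD5 qw(md5);

    # Binary chop over the sorted, fixed-length 24-byte index records.
    # Returns the data-file offset for $key, or undef if the key is absent.
    sub lookup_offset {
        my ($idx, $key) = @_;
        my $want = md5($key);                 # 16-byte binary target
        my $n    = (-s $idx) / 24;            # record count from file size
        my ($lo, $hi) = (0, $n - 1);
        while ($lo <= $hi) {
            my $mid = int(($lo + $hi) / 2);
            seek $idx, $mid * 24, 0 or die "seek: $!";
            read($idx, my $rec, 24) == 24 or die "short read";
            my $cmp = substr($rec, 0, 16) cmp $want;
            if    ($cmp < 0) { $lo = $mid + 1 }
            elsif ($cmp > 0) { $hi = $mid - 1 }
            else  { return unpack 'Q>', substr($rec, 16, 8) }  # hit
        }
        return;                               # not found
    }

    open my $idx, '<:raw', 'data.idx' or die "open: $!";
    if (defined(my $pos = lookup_offset($idx, 'somekey'))) {
        open my $data, '<', 'data.txt' or die "open: $!";
        seek $data, $pos, 0;
        print scalar <$data>;                 # the record for 'somekey'
    }

Each probe costs one seek plus one 24-byte read, so a lookup over 500 million records is about 29 probes, and the hot upper levels of the index end up in the OS file cache anyway.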
Still pushes you beyond your 40 GB disk, but 60 GB disks aren't that much more expensive.