in reply to Best way to look-up a string amongst 100 million of its peers

If your text file never changes, CDB_File is a possible solution when you have more disk space than RAM. If you have a cluster of machines, Cache::Memcached is most likely a good solution, because it lets you keep the whole file in RAM, distributed across the machines, and search it relatively quickly.
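
To make the CDB_File suggestion concrete, here is a minimal sketch, assuming the input is a text file with one string per line. The helper names `build_cdb` and `make_lookup` are made up for illustration; only the `CDB_File` calls themselves come from the module.

```perl
use strict;
use warnings;
use CDB_File;

# Build the constant database once. CDB_File writes to a temporary
# file and renames it into place when finish() is called.
# (build_cdb is a hypothetical helper name, not part of CDB_File.)
sub build_cdb {
    my ($in, $out) = @_;
    my $maker = CDB_File->new($out, "$out.$$")
        or die "cannot create $out: $!";
    open my $fh, '<', $in or die "cannot open $in: $!";
    while (my $line = <$fh>) {
        chomp $line;
        $maker->insert($line, 1);    # the value is just a presence marker
    }
    close $fh;
    $maker->finish or die "cannot finish $out: $!";
    return;
}

# Return a closure that answers "is this string in the file?" through
# a read-only tied hash, so lookups go to disk rather than to RAM.
# (make_lookup is a hypothetical helper name as well.)
sub make_lookup {
    my ($file) = @_;
    tie my %seen, 'CDB_File', $file or die "cannot tie $file: $!";
    return sub { exists $seen{ $_[0] } };
}
```

You would call `build_cdb('strings.txt', 'strings.cdb')` once (both file names are placeholders), and then each lookup process does `my $has = make_lookup('strings.cdb');` and tests `$has->($string)`.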


Re^2: Best way to look-up a string amongst 100 million of its peers
by perrin (Chancellor) on Mar 25, 2008 at 20:41 UTC
    BerkeleyDB is actually faster than CDB_File, as seen in previous benchmarks on PerlMonks. I wouldn't recommend using memcached for anything where lost data would be considered a problem.

      I only found SQLite vs CDB_File vs BerkeleyDB, and there CDB_File came out ahead of BerkeleyDB wherever it took part. On the other hand, that benchmark is from 2002, and CDB_File didn't make it into the second round due to build problems on Windows.

        In the followup by demerphq, he totally destroyed that benchmark just by switching BerkeleyDB to BTree.
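
For reference, switching the BerkeleyDB module from its hash format to BTree comes down to which class you tie to. This is only a sketch under that assumption, not the code from the benchmark; `open_btree` is a made-up helper name.

```perl
use strict;
use warnings;
use BerkeleyDB;

# Open (or create) a BTree-format database and return a reference to
# the tied hash. Tying to 'BerkeleyDB::Hash' instead would give the
# hash format. (open_btree is a hypothetical helper name.)
sub open_btree {
    my ($file) = @_;
    tie my %db, 'BerkeleyDB::Btree',
        -Filename => $file,
        -Flags    => DB_CREATE
      or die "cannot open $file: $! $BerkeleyDB::Error";
    return \%db;
}
```

With `my $db = open_btree('strings.db');` (the file name is a placeholder), you store with `$db->{$string} = 1` and look up with `exists $db->{$string}`.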