You should go to Sleepycat's website and read the documentation, paying special attention to the C API (since that's what Paul's BerkeleyDB module wraps). I suspect you should consider changing some of the default caching settings or one of the other tuning options. Be sure not to miss goodies like the section on access method tuning.
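As a rough illustration of the caching point, here is a minimal sketch that asks for a larger cache when opening the database through the BerkeleyDB module. The file name and the 64 MB figure are assumptions made up for the example, not recommendations; the right size depends on your data and your RAM.

```perl
use strict;
use warnings;
use BerkeleyDB;

# Assumption: 64 MB is an arbitrary illustrative cache size.
my $cache_bytes = 64 * 1024 * 1024;

# 'example.db' is a hypothetical file name used only for this sketch.
my $db = BerkeleyDB::Hash->new(
    -Filename  => 'example.db',
    -Flags     => DB_CREATE,
    -Cachesize => $cache_bytes,
) or die "Cannot open example.db: $! $BerkeleyDB::Error\n";

# db_put and db_get return 0 on success.
$db->db_put('some key', 'some value') == 0
    or die "db_put failed: $BerkeleyDB::Error\n";
```

If most of your working set fits in the cache, far fewer page reads and writes should hit the disk, but measure before and after rather than trusting a guess.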
I had a similar problem some time ago. My bottleneck was not having enough RAM. You can check this by running the top utility while your program is running: if your process is swapping, you will see a "D" (uninterruptible sleep, usually waiting on disk I/O) instead of "R" in the process status column, and the CPU usage will be very low.
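If you would rather check from a script than watch top, something like the following sketch reads the same state letter from ps. The helper name process_state is made up for this example, and it assumes a ps that accepts `-o stat= -p PID` (Linux procps and the BSDs do).

```perl
use strict;
use warnings;

# Return the state field ps reports for a PID (e.g. "R", "S", "D"),
# or undef if ps finds no such process.
sub process_state {
    my ($pid) = @_;
    die "PID must be a positive integer\n"
        unless defined $pid && $pid =~ /^\d+$/;
    my $out = `ps -o stat= -p $pid 2>/dev/null`;
    return undef unless defined $out && $out =~ /(\S+)/;
    return $1;
}

# Default to this script's own PID when none is given on the command line.
my $pid   = @ARGV ? $ARGV[0] : $$;
my $state = process_state($pid);

if (!defined $state) {
    print "No such process: $pid\n";
}
elsif ($state =~ /^D/) {
    print "PID $pid is in state D (uninterruptible sleep, likely waiting on disk)\n";
}
else {
    print "PID $pid is in state $state\n";
}
```

A single "D" reading proves little; sample it a few times while the load is running and compare with the swap figures top shows.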
I had a similar problem. I had around 10 million entries for a search engine I wrote. Are you accessing some of the keys more than once when loading the data? If so, you can save a ton of time by presorting the data. I went from 8 hours to 35 minutes.
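To show what presorting buys you, here is a small sketch under assumed conditions: the input is "key<TAB>value" records with repeated keys, and a plain hash stands in for the tied database. Once the records are sorted by key, all values for a key arrive together, so each key is written once instead of being fetched and rewritten every time it reappears. The record format and variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input: one "key<TAB>value" record each, keys repeating.
my @records = ("perl\tdoc3", "db\tdoc1", "perl\tdoc1", "hash\tdoc2", "db\tdoc7");

my %store;          # stand-in for the tied database hash
my $writes = 0;     # how many times the store is written

# Sort by key so that identical keys become adjacent.
my @sorted = sort { $a->[0] cmp $b->[0] }
             map  { [ split /\t/, $_, 2 ] } @records;

my ($current_key, @values);

# Write the accumulated values for the current key in one go.
my $flush = sub {
    return unless defined $current_key;
    $store{$current_key} = join ',', @values;
    $writes++;
};

for my $rec (@sorted) {
    my ($key, $value) = @$rec;
    if (defined $current_key && $key ne $current_key) {
        $flush->();
        @values = ();
    }
    $current_key = $key;
    push @values, $value;
}
$flush->();   # do not forget the last key

print "$writes writes for ", scalar(@records), " records\n";
```

For millions of records you would not sort in memory like this; running the file through the system sort utility first and then reading it line by line gives the same grouping.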
-Lee
UPDATE
Another note: if you're using keys() on the hash, it will build a list of all the keys in memory, which will certainly thrash your box. That is, of course, the correct behaviour for keys().
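One way around the keys() problem is to walk the hash with each(), which hands back one key/value pair at a time instead of building the full list. A minimal sketch, with a small plain hash standing in for the tied database hash (the contents are made up):

```perl
use strict;
use warnings;

# Stand-in for the tied database hash; the entries are invented.
my %h = (apple => 1, banana => 2, cherry => 3);

my $count = 0;
my $total = 0;

# each() returns one pair per call, so memory use stays flat
# no matter how many entries the hash holds.
while (my ($key, $value) = each %h) {
    $count++;
    $total += $value;
}

print "visited $count entries, total $total\n";
```

Avoid adding or deleting other entries while the loop is running, since that can confuse the iterator.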