in reply to Re: Re: Berkeley DB performance, profiling, and degradation...
in thread Berkeley DB performance, profiling, and degradation...

I say it's surprising because a hash algorithm is supposed to maintain a fairly constant lookup time when you put more data into it. Maybe switching between the hash and BTree options of DB_File would make a difference.
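
For what it's worth, the switch is a one-argument change when tying with DB_File. A minimal sketch (the file names here are just placeholders):

    use strict;
    use warnings;
    use DB_File;
    use Fcntl;

    my %cache;

    # Default hash access method:
    tie %cache, 'DB_File', '/tmp/cache.db', O_RDWR|O_CREAT, 0666, $DB_HASH
        or die "Cannot tie DB_HASH file: $!";
    untie %cache;

    # BTree access method -- only the final argument changes:
    tie %cache, 'DB_File', '/tmp/cache.btree', O_RDWR|O_CREAT, 0666, $DB_BTREE
        or die "Cannot tie DB_BTREE file: $!";
    untie %cache;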

I have used BerkeleyDB with the 3.x series from Sleepycat pretty extensively. The main advantages it offers are in the area of fancier locking and caching. With a single writer and the data on a RAM disk, these aren't likely to make much difference. It's worth a shot though.
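
If you do end up trying the BerkeleyDB module, the cache and locking knobs live on the database and environment constructors. A rough sketch of the standalone-database case (the filename and cache size are placeholders, not recommendations):

    use strict;
    use warnings;
    use BerkeleyDB;

    # Standalone BerkeleyDB::Hash with an explicit cache size.
    my $db = BerkeleyDB::Hash->new(
        -Filename  => '/tmp/cache.db',
        -Flags     => DB_CREATE,
        -Cachesize => 4 * 1024 * 1024,
    ) or die "Cannot open db: $BerkeleyDB::Error";

    $db->db_put('some_key', 'some_value');
    my $value;
    $db->db_get('some_key', $value);

The fancier locking comes from opening the database inside a BerkeleyDB::Env (e.g. with DB_INIT_LOCK), which, as noted, probably buys you little with a single writer on a RAM disk.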


Replies are listed 'Best First'.
Re: Re: Re: Re: Berkeley DB performance, profiling, and degradation...
by SwellJoe (Scribe) on Feb 19, 2002 at 22:46 UTC
    This was my assumption as well (that lookups should remain roughly constant past some point), but clearly that isn't the case.

    I've already tried switching to BTREE with no measurable difference--I think having the db in RAM nullifies all of the tweaks that are available (cachesize, etc.).

    One thing I have thought of, which might be helpful, is that I already have a hash value which is my key in the database. As I understand it, Berkeley DB then computes a new hash from my key to store the object. Any chance I could use my own hashes as record numbers or something similar? (The hash I have for a key is a 32-byte MD5, which matches the Squid hash key for a given object.) That would avoid the key-hashing step of each STORE and FETCH. It might not be a benefit, though... I'll worry more about it if SDBM_File doesn't fix my problems.
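
    If it turns out to matter, DB_File does let you supply your own hash routine through a HASHINFO object, so the already-computed MD5 key could feed the hashing step directly instead of being hashed a second time. A rough, untested sketch (the path is a placeholder, and whether it actually saves anything is exactly the open question):

        use strict;
        use warnings;
        use DB_File;
        use Fcntl;

        # Custom hash routine: the key is already a hex MD5, so just fold
        # its first 8 hex digits into an integer rather than re-hashing.
        my $info = DB_File::HASHINFO->new;
        $info->{'hash'} = sub {
            my ($key) = @_;
            return hex(substr($key, 0, 8));
        };

        my %cache;
        tie %cache, 'DB_File', '/tmp/cache.db', O_RDWR|O_CREAT, 0666, $info
            or die "Cannot tie db: $!";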