in reply to BerkeleyDB vs. Linux file system

To state the obvious:

There is a lot of difference between having 100 5MB documents and 5M 100B documents.

If your test doesn't show a performance difference, weigh the other factors:
Using the file system is invaluable during testing, since you can inspect your files directly with standard tools.
OTOH, backing up millions of little files is a pain compared to backing up one database file.

One more question: Do you need any kind of concurrency?
This could also influence your decision one way or the other.

Re: Re: BerkeleyDB vs. Linux file system
by perrin (Chancellor) on Mar 18, 2003 at 15:57 UTC
    I actually think that as long as you start splitting files across directories to keep from getting more than 1000 in a single dir, both file system and Berkeley would scale very far without a huge difference in performance. Remember, BerkeleyDB handles databases with terabytes of data.
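    The directory-splitting scheme described above (keeping any one directory under ~1000 entries) is commonly done by hashing the document key and using the leading hex digits as subdirectory names. A minimal sketch — the `doc_path` function and the `/var/docs` root are made-up names for illustration, not anything from the thread:

    ```perl
    use strict;
    use warnings;
    use Digest::MD5 qw(md5_hex);

    # Map a document key to a two-level subdirectory based on its MD5,
    # e.g. "mykey" -> "/var/docs/ab/cd/mykey".  With 256 buckets at each
    # level, millions of files still leave each directory comfortably
    # under 1000 entries on average.
    sub doc_path {
        my ($root, $key) = @_;
        my $hex = md5_hex($key);
        return join '/', $root,
            substr($hex, 0, 2),     # first-level bucket (00..ff)
            substr($hex, 2, 2),     # second-level bucket (00..ff)
            $key;
    }

    my $path = doc_path('/var/docs', 'article-42.txt');
    ```

    Create the containing directory (e.g. with `File::Path::make_path`) before writing each file.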

    There are definitely many advantages to having things in normal files, especially for text content, and it's the only choice for NFS or other file servers.

    I did use the DB_INIT_CDB flag, which initializes the concurrency methods. If I leave that off, BerkeleyDB gets faster, but you lose the ability to do concurrent access. I didn't think the test would be very interesting if it used options that didn't allow for concurrency.
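    For reference, here is roughly how DB_INIT_CDB is passed when opening an environment with the Perl BerkeleyDB module. This is a generic sketch, not perrin's actual benchmark code; the environment home and database filename are placeholders:

    ```perl
    use strict;
    use warnings;
    use BerkeleyDB;
    use File::Temp qw(tempdir);

    # DB_INIT_CDB enables the Concurrent Data Store mode: multiple
    # readers plus a single writer, without full transactional locking.
    my $home = tempdir(CLEANUP => 1);   # throwaway env home for this sketch

    my $env = BerkeleyDB::Env->new(
        -Home  => $home,
        -Flags => DB_CREATE | DB_INIT_MPOOL | DB_INIT_CDB,
    ) or die "cannot open environment: $BerkeleyDB::Error";

    my $db = BerkeleyDB::Hash->new(
        -Filename => 'docs.db',
        -Env      => $env,
        -Flags    => DB_CREATE,
    ) or die "cannot open database: $BerkeleyDB::Error";

    $db->db_put('doc1', 'contents of document one');
    $db->db_get('doc1', my $val);

    # Dropping DB_INIT_CDB from -Flags gives the faster single-user
    # mode that the comparison above refers to.
    ```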