in reply to Perl's Hash vs BerkeleyDB vs MySQL

The big advantages of BerkeleyDB are:

So whenever I want a hash that persists on disk, either because my RAM isn't big enough to hold all of it or because I want the data to persist between invocations of the program, I use a tied hash that maps a Perl hash to a BerkeleyDB file.
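A minimal sketch of such a tied hash, assuming the core DB_File module (the Berkeley DB 1.x-style interface) is installed. The file name cache.db and the runs key are made up for illustration; the BerkeleyDB module offers a similar tie interface with more options.

```perl
use strict;
use warnings;
use Fcntl;      # O_RDWR, O_CREAT
use DB_File;    # exports $DB_HASH

my $file = 'cache.db';    # hypothetical file name

# Tie the hash to an on-disk Berkeley DB hash file, creating it if needed.
tie my %cache, 'DB_File', $file, O_RDWR | O_CREAT, 0666, $DB_HASH
    or die "Cannot tie $file: $!";

# Reads and writes go to disk, so the count survives between invocations.
$cache{runs} = ($cache{runs} // 0) + 1;
print "This program has run $cache{runs} time(s)\n";

untie %cache;   # flush and close the file
```

Apart from the tie and untie calls, the rest of the program treats %cache like any other hash.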

Another case is when I want multiple instances of a program to write data (infrequently) and the danger of conflicts is relatively small. BDB is convenient then, too.

In other cases, I use a plain Perl hash instead.

Re^2: Perl Hash vs BerkeleyDB vs MySQL
by pajout (Curate) on May 09, 2006 at 07:39 UTC
    I agree, but I see some disadvantages also:
    • Need for recovery when the process does not stop normally.
    • A different conception of concurrency/locking from SQL. You need to understand it precisely before designing the application.
    • No shared memory between processes (as an SQL engine can provide).
      What is this "other conception of concurrency" you're referring to? BerkeleyDB allows either database level or page level locks. Page level locks are trickier, since you have to run the deadlock daemon. Database level locks are trivial and require no special knowledge, and perform well.

      I don't understand your shared memory comment. MySQL is implemented as multiple threads which share memory, but they don't share memory with your program. BerkeleyDB does use a shared memory cache, and it runs resident in your process, so you are accessing data directly from shared memory, unlike MySQL where you access it over a socket.

        My English is not perfect, surely :>)
        I mentioned "other conception" because of the need for a deadlock daemon, for instance. From _my_ point of view, that leaves it up to the developer to manage deadlocks, whereas SQL engines behave in a more standard way and handle them without additional effort. Additionally, I know table locking and row locking in SQL, but not page locking.
        As for "shared memory": if you have two separate processes using BerkeleyDB, there is no easy way to share memory holding loaded data between those processes. By contrast, when you have two separate processes using two connections to an SQL database, the SQL engine can use shared memory to serve both connections (when they query the same data, of course).
Re^2: Perl Hash vs BerkeleyDB vs MySQL
by monkfan (Curate) on May 09, 2006 at 06:47 UTC
    Managed concurrent access
    What do you mean by that, Corion? This is the first time I've heard this term.
    Are you saying Perl can't achieve that as well?

    Regards,
    Edward

      By concurrent access I mean two or more programs writing to the file virtually at the same time.

      BerkeleyDB knows how to lock and unlock the database so that only one program modifies the database at one time and the file doesn't get corrupted. When BerkeleyDB manages the writing, I don't need to worry about this in my programs - and file locking is easy to get wrong. The only thing I have to worry about is when two programs modify the same value at the same time, but that's something I have to think about anyway.
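
The remaining concern, two programs changing the same value at the same time, is a read-modify-write race. Below is a rough sketch of one way to serialize it. It is an illustration under assumptions, not what BerkeleyDB does internally: it uses the core DB_File module, which does no locking of its own, and takes an advisory flock on a separate lock file. The locked_increment helper and the .lock suffix are hypothetical names.

```perl
use strict;
use warnings;
use Fcntl qw(:DEFAULT :flock);   # O_RDWR, O_CREAT, LOCK_EX
use DB_File;

# Increment the counter stored under $key, holding an exclusive lock for
# the whole read-modify-write cycle so that no update is lost.
sub locked_increment {
    my ($db_file, $key) = @_;

    # Assumed convention: lock a separate sentinel file, not the database.
    open my $lock, '>>', "$db_file.lock" or die "Cannot open lock file: $!";
    flock $lock, LOCK_EX or die "Cannot lock: $!";

    # Tie only after the lock is held, so no stale data is read.
    tie my %h, 'DB_File', $db_file, O_RDWR | O_CREAT, 0666, $DB_HASH
        or die "Cannot tie $db_file: $!";
    my $new = ($h{$key} // 0) + 1;
    $h{$key} = $new;
    untie %h;       # flush to disk before the lock is released

    close $lock;    # closing the handle releases the lock
    return $new;
}

print locked_increment('counter.db', 'hits'), "\n";
```

Every program that touches the file has to go through the same lock for this to work, which is the kind of bookkeeping that is easy to get wrong by hand.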