Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

O Learned Ones:

I have inherited a rather large chunk of Perl code which makes extensive use of BerkeleyDB (via the tied hash interface). I would like to see the access patterns of the databases (a large number of which are in use). Is there a hook I can add in a central place that will let me log all accesses (fetches, reads, writes)?

Much appreciate your wisdom on this. :)

Replies are listed 'Best First'.
Re: Profiling BerkeleyDB?
by madbombX (Hermit) on Nov 03, 2006 at 02:20 UTC
    I believe that you need to provide us with more information, as it sounds like you have a bunch of code for a custom-rolled application. However, generically speaking, you can do a few things. You can wrap the calling methods so that they include a logging function, i.e. every time a db_get() is called, have it run the logger function and then do the db_get(). This also means that if the reads always go through the same method (assuming it's OO) or the same function, you can add a logging hook to that one place. Or, if you are feeling really frisky, you can always edit the BerkeleyDB module (assuming that is the one you are using to access the tied hash interface). Although it is generally not recommended to mess with CPAN modules, this is one of your options (since you are doing it locally). You may also want to read up on "Overriding Built-in Functions" in perlsub. Again, the method you choose depends on how the code that has been dropped in your lap was written.

    You can also just profile the code using a regular profiler. Here is a thread that talks a little about profiling: Get Execution time.
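
    The logging-hook idea above could look roughly like this for a tied hash. This is only a sketch under assumptions: LoggingHash is a made-up class name, the log goes to an in-memory array rather than a file, and Tie::StdHash stands in for the real database object so the example runs without BerkeleyDB installed. In the real code the inner object would presumably be whatever tie() returns for the BerkeleyDB hash.

```perl
use strict;
use warnings;

# LoggingHash (hypothetical name): a tie class that records each hash
# operation and then forwards it to an inner tied object.
package LoggingHash;

our @LOG;    # collected access records; write to a file in real use

sub TIEHASH {
    my ($class, $label, $inner) = @_;
    return bless { label => $label, inner => $inner }, $class;
}

sub _log {
    my ($self, $op, @args) = @_;
    push @LOG, join ' ', $self->{label}, $op, @args;
}

sub FETCH    { my ($s, $k) = @_;     $s->_log('FETCH',  $k); return $s->{inner}->FETCH($k) }
sub STORE    { my ($s, $k, $v) = @_; $s->_log('STORE',  $k); return $s->{inner}->STORE($k, $v) }
sub EXISTS   { my ($s, $k) = @_;     $s->_log('EXISTS', $k); return $s->{inner}->EXISTS($k) }
sub DELETE   { my ($s, $k) = @_;     $s->_log('DELETE', $k); return $s->{inner}->DELETE($k) }
sub FIRSTKEY { my ($s) = @_;         $s->_log('FIRSTKEY');   return $s->{inner}->FIRSTKEY }
sub NEXTKEY  { my ($s, $last) = @_;  return $s->{inner}->NEXTKEY($last) }

package main;

require Tie::Hash;    # core module that defines Tie::StdHash

# Assumption: an in-memory stand-in for the object the real tie() returns.
my $inner = Tie::StdHash->TIEHASH;

tie my %db, 'LoggingHash', 'users.db', $inner;

$db{name_1} = 'alice';               # logged as STORE
my $name   = $db{name_1};            # logged as FETCH
my $exists = exists $db{adr_1};      # logged as EXISTS

print "$_\n" for @LoggingHash::LOG;
```

    The point of wrapping rather than editing is that the installed module stays untouched: only the place where the hashes get tied has to change.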

      This was my first thought actually. I have done it for a different module. But it doesn't seem to work on BerkeleyDB! I modified the "FETCH" and "EXISTS" subs to log stuff, but it didn't do anything. I'm thinking it might have something to do with XSLoader and the fact that BerkeleyDB is a binary module.
Re: Profiling BerkeleyDB?
by BerntB (Deacon) on Nov 03, 2006 at 05:14 UTC
    Please don't use this against me. Check "perldoc -l". You can save a copy of a module, modify it to log someplace -- and then copy it back after everything is done.

    Disclaimer: It is stupid. If you have to do it, use a local machine and don't blame me if it shoots you in the foot with a grenade rifle. If you do this on a server used by others, they will burn you in effigy (if you're lucky).

    You might, as a start, check something I've really seen:

    $bd{"name_$id"} = $name; $bd{"adr_$id"} = $address; # etc.

    I.e. it does a db access for every field! (It was the first job after uni for the guilty party and a long time ago, so I won't name him. No, not me. :-)

    Update: madbombX had already written what I wrote, but better. Must have missed it at first reading. Sorry for wasting people's time. (To really insert a layer for a calling module, you could write a layer module and use that instead. The layer module uses the real db module and implements the part of the API you want to log -- use AUTOLOAD for the rest of the API. There must be multiple modules on CPAN which already do this for OO and functional interfaces.)
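
    A rough sketch of such a layer module, for the OO case. Everything named here is hypothetical: FakeDB is a stand-in for the real database class, and its put/get/count methods are invented for illustration, not taken from any real module. The logged calls are written out by hand; AUTOLOAD forwards everything else unchanged.

```perl
use strict;
use warnings;

# FakeDB: made-up stand-in for the real database class.
package FakeDB;

sub new   { my $class = shift; return bless { data => {} }, $class }
sub put   { my ($s, $k, $v) = @_; $s->{data}{$k} = $v; return 1 }
sub get   { my ($s, $k) = @_; return $s->{data}{$k} }
sub count { my ($s) = @_; return scalar keys %{ $s->{data} } }

# LoggedDB: the layer. It holds the real object and logs selected calls.
package LoggedDB;

our $AUTOLOAD;
our @LOG;    # in-memory log for the sketch; use a file in real code

sub new {
    my ($class, $db) = @_;
    return bless { db => $db }, $class;
}

# The calls we care about are spelled out and logged.
sub get { my ($self, $k) = @_;     push @LOG, "get $k"; return $self->{db}->get($k) }
sub put { my ($self, $k, $v) = @_; push @LOG, "put $k"; return $self->{db}->put($k, $v) }

# Every other method is passed straight through to the wrapped object.
sub AUTOLOAD {
    my $self = shift;
    (my $method = $AUTOLOAD) =~ s/.*:://;
    return if $method eq 'DESTROY';    # nothing to forward on destruction
    return $self->{db}->$method(@_);
}

package main;

my $db = LoggedDB->new(FakeDB->new);
$db->put(name_1 => 'alice');
my $name  = $db->get('name_1');
my $count = $db->count;    # not logged, handled by AUTOLOAD

print "$_\n" for @LoggedDB::LOG;
```

    The calling code then only needs to construct LoggedDB instead of the real class; the rest of the API keeps working through the AUTOLOAD fallback.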

Re: Profiling BerkeleyDB?
by Anonymous Monk on Nov 03, 2006 at 09:25 UTC
    No, but you can turn on logging, and subsequently examine/analyze the logs.
      I have read through the BerkeleyDB documentation, but couldn't find this logging option. Can you tell me how to do it? It might be the solution I need.
      Thanks!
        Hmm
        $ perldoc BerkeleyDB |grep -i log
            If you don't intend using transactions, locking or logging, then you
            DB_DATA_DIR, DB_LOG_DIR and DB_TMP_DIR
            DB_LOG_DIR => "/home/logs",
            by Berkeley DB will be logged to this file. A useful debug setting
            DB_INIT_LOG
            Initialise the Logging sub-system.
Re: Profiling BerkeleyDB?
by Anonymous Monk on Nov 03, 2006 at 09:00 UTC
    BerkeleyDB makes extensive use of XSLoader and binary files.
    I tried adding a print statement to the "FETCH" sub (and the "EXISTS" sub), but they print nothing. I'm guessing these calls are handled directly by the BerkeleyDB binaries (.so files).