skinnymofo has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
  Any wisdom on DBM files becoming corrupted? I'm running ActiveState Perl on WinNT Server 4.0 SP 6. The DBM module Perl is using is DB_File.pm. The program ran fine until recently.
  I've looked through PerlMonks and noticed people have mentioned limits to the file size on certain implementations. The DBM file in question is a smidge over 82,000 KB.

Many humble thanks,
Skinnymofo

Re: DBM file corruption?
by dws (Chancellor) on Jan 15, 2002 at 08:10 UTC
    Any wisdom on DBM files becoming corrupted?

    DBM is not serially-reentrant*. If you have more than one process trying to write, you risk corruption unless you serialize access by implementing a locking scheme.

    I don't have any info on practical size limits (my DBM databases are all much smaller).

    *My info on this is 1.5 years stale. There might have been some improvements since then.
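    One way such a locking scheme could look is sketched below. This is only an illustration, assuming flock works on the platform and that every writer agrees to go through the same lock file; the helper name with_locked_db and the "$dbfile.lock" naming are made up for the example.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);   # LOCK_EX
use DB_File;            # also exports O_RDWR, O_CREAT and $DB_HASH

# Hypothetical helper: run $code against the tied hash while holding an
# exclusive lock on a separate lock file ("$dbfile.lock" is an assumed name).
sub with_locked_db {
    my ($dbfile, $code) = @_;

    open(my $lock, '>>', "$dbfile.lock")
        or die "Cannot open lock file $dbfile.lock: $!";
    flock($lock, LOCK_EX)
        or die "Cannot lock $dbfile.lock: $!";

    # Tie only after the lock is held, so we never read pages that
    # another process is still in the middle of writing.
    my %db;
    tie(%db, 'DB_File', $dbfile, O_RDWR | O_CREAT, 0644, $DB_HASH)
        or die "Cannot tie $dbfile: $!";

    my $result = $code->(\%db);

    untie %db;      # flush to disk before giving up the lock
    close($lock);   # closing the handle releases the lock
    return $result;
}
```

    A writer would then call something like with_locked_db('data.db', sub { $_[0]{key} = 'value' }). Locking a separate file rather than the database file itself keeps the lock independent of when DB_File opens and caches the database.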

(crazyinsomniac) Re: DBM file corruption?
by crazyinsomniac (Prior) on Jan 15, 2002 at 09:08 UTC
    That is too little information, skinnymofo. DB_File comes in a variety of *versions*, and depending on which Berkeley DB you have installed, yet another set of *versions*. Also, are you using DB_RECNO, DB_BTREE, or DB_HASH?

    Are you sharing the "database" with a C application (or some non-perl non-DB_File thing)?

    I use DB_File all the time, and I'd really like to know exactly what you're doing.

    Is the "database" being shared between perl applications (race conditions)?

    Are you locking the "database" somehow?

    Did you perhaps interrupt the application whilst it was in the midst of writing to the database?

    The only size limits I've heard of are on keys/values, usually when dealing with DB_RECNO (and bigger files would decrease performance, but I've not heard of corruption).

    DB_File databases, with possibly the exception of DB_RECNO, are *usually* not portable across systems, or at least not across versions. Did you perhaps upgrade DB_File?

    The fact that you're running WinNT is not of great help. You need to give us the perl version and, more importantly, the DB_File version.

    update:
    P.S. - I don't like to call DB_File a DBM file, 'cause it's so much better than the others ;)

    Also, and I can't believe I didn't focus on this first, please define "corruption". What is going wrong?

     
    ______crazyinsomniac_____________________________
    Of all the things I've lost, I miss my mind the most.
    perl -e "$q=$_;map({chr unpack qq;H*;,$_}split(q;;,q*H*));print;$q/$q;"

      CrazyInsomniac and others, thanks for your replies.

        I'm using Perl v5.6.0, DB_File v1.73, and DB_HASH.

        I looked at locking and race conditions in other nodes on PerlMonks, but as the file is not being shared with any other programs I don't think that's the issue.

        Sorry, corruption is rather ambiguous; I mean that at a particular point in the file (the 40,309th key) the value looks like a null to Perl. {Thanks YuckFoo for the hint} But, the step that writes the key/value gives no error, hiccup, or other indication that the write to that particular key went bad.
        One thing I didn't think about, though, you have thankfully brought to light. There's a very good chance that the program got interrupted 'whilst' using the file.
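        If interruption is the culprit, one possible guard is to catch the signal, finish the current write, and close the database properly. The sketch below is only an illustration: the helper name write_records and its array-of-pairs input are invented for the example. It also cannot help with a power loss or a hard kill of the process.

```perl
use strict;
use warnings;
use DB_File;   # exports O_RDWR, O_CREAT and $DB_HASH

# Hypothetical helper: store [key, value] pairs, stopping cleanly if the
# program is interrupted instead of dying halfway through a write.
sub write_records {
    my ($dbfile, $records) = @_;   # $records is an array ref of pairs

    my %db;
    my $handle = tie(%db, 'DB_File', $dbfile, O_RDWR | O_CREAT, 0644, $DB_HASH)
        or die "Cannot tie $dbfile: $!";

    # Only note the signal here; the loop decides when it is safe to stop.
    my $interrupted = 0;
    local $SIG{INT}  = sub { $interrupted = 1 };
    local $SIG{TERM} = sub { $interrupted = 1 };

    my $written = 0;
    for my $pair (@$records) {
        last if $interrupted;
        $db{ $pair->[0] } = $pair->[1];
        $written++;
    }

    $handle->sync;   # push buffered pages out to disk
    undef $handle;   # drop the extra reference before untie
    untie %db;
    return $written;
}
```

        The return value is the number of pairs actually stored, so the caller can tell whether the run was cut short.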

      Many Thanks,
      Skinnymofo
Re: DBM file corruption?
by YuckFoo (Abbot) on Jan 15, 2002 at 09:08 UTC
    To start, if the data is text, dump the dbm to a text file and see if keys and data are what you expect, see if they are reasonable.

    YuckFoo

    #!/usr/bin/perl
    use strict;

    if (@ARGV < 1) {
        print STDERR "\nUsage $0 dbfile\n\n";
        exit;
    }

    my ($dbfile) = @ARGV;
    my (%DB, $key, $val);

    if (!dbmopen(%DB, $dbfile, 0444)) {
        print STDERR "\nError opening file $dbfile\n\n";
        exit;
    }

    while (($key, $val) = each %DB) {
        print "$key,$val\n";
    }

    dbmclose(%DB);

      Two issues.

      The obvious one is that knowing the exact error and the line it occurred on is often very important. So you should use die, without a trailing newline in the message (so that Perl appends the line number), and have $! in the die statement. (Just like it says in perlstyle.)

      The second, more subtle, issue is that dbmopen is unsafe. It will assume that your dbm files are stored in the "best" format that your Perl knows how to handle. As files move from machine to machine, or if you install a new module, the "best" format may change, meaning that your code no longer knows how to read the dbm file from disk. It is therefore far better IMNSHO to use tie and be explicit about how you are accessing the dbm file.
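
      Both points could be applied to the dump script roughly as follows. This is a sketch, not a drop-in replacement: it assumes the file is a DB_File hash (DB_HASH, as reported earlier in the thread), and the helper name dump_db and its "suspect keys" return value are made up for the example.

```perl
use strict;
use warnings;
use DB_File;   # exports O_RDONLY and $DB_HASH

# Hypothetical helper: print every key/value pair to the filehandle $out
# and return the keys whose values are undefined or empty.
sub dump_db {
    my ($dbfile, $out) = @_;

    # Explicit tie: we state the format (DB_File, DB_HASH) instead of
    # letting dbmopen pick whatever it considers "best".
    my %db;
    tie(%db, 'DB_File', $dbfile, O_RDONLY, 0444, $DB_HASH)
        or die "Cannot open $dbfile: $!";   # no newline, so die adds the line number

    my @suspect;
    while (my ($key, $val) = each %db) {
        push @suspect, $key unless defined $val && length $val;
        print {$out} "$key,", (defined $val ? $val : ''), "\n";
    }

    untie %db;
    return @suspect;
}
```

      Called as my @bad = dump_db($ARGV[0], \*STDOUT), it prints the same key,value lines as the original script and hands back any keys that look empty, which is the symptom described above.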