zentara has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I was just looking at DBM::Deep for a script, and tried the example code from the perldoc. I noticed that the database file grows with each run, even though the values stored are the same. Am I setting this up wrong, or does the hash leak? The file grows by about 350 bytes on each run.
#!/usr/bin/perl
use warnings;
use strict;
use DBM::Deep;

my $db = DBM::Deep->new( "ztest1.db" );

$db->{mykey} = "myvalue";

#if you comment this line, it doesn't gain
$db->{myhash} = {};
$db->{myhash}->{subkey} = "subvalue";

print $db->{mykey}, "\n";
print $db->{myhash}->{subkey} , "\n";

foreach my $key (keys %$db) {
    print "$key: " . $db->{$key} . "\n";
}

I'm not really a human, but I play one on earth. flash japh

Re: DBM::Deep dbfile size
by dragonchild (Archbishop) on Mar 09, 2006 at 00:40 UTC
    Heh. I'm actually working on this right now for the 1.00 release. The immediate fix is that you need to periodically run $db->optimize(), kinda like you used to have to vacuum in Postgres.
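
As a minimal sketch of that immediate fix, the test script from the question could call optimize() after its writes and compare the file size before and after. The file name is the one from the question; calling it on every run and the size report are assumptions made for illustration, not a recommended schedule. No size figures are shown because they depend on the DBM::Deep version in use.

```perl
#!/usr/bin/perl
use warnings;
use strict;
use DBM::Deep;

my $file = "ztest1.db";    # file name taken from the question
my $db   = DBM::Deep->new($file);

# Same writes as the test script in the question.
$db->{mykey}  = "myvalue";
$db->{myhash} = {};
$db->{myhash}->{subkey} = "subvalue";

my $before = -s $file;

# Rebuild the file without the dead space. Call this on the top-level
# object only, keep no references to child hashes around, and back up
# the file first if the data matters.
$db->optimize() or die "optimize() failed on $file\n";

my $after = -s $file;
print "size before optimize: $before bytes, after: $after bytes\n";
```

Since optimize() rewrites the whole file, it probably belongs in an occasional maintenance step rather than in every run of a real script.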

    For those who care, the problem is in how the filespace is managed. In the 0.x series, there is no concept of freed space. There is just the end of the file. So, when you do $db->{myhash} = {} (or anything that triggers the CLEAR() subroutine), only the index is currently wiped clean. The actual space is still used. Nothing in the 0.x series will actually mark that space as being usable.

    For 1.00, I'm working on a freespace management algorithm that will attempt to reuse as much freespace as possible. So, in this case, it will remain the same size because the area that is now freed will be marked as being usable.

    It will still leak, but the leak will be any contiguous filespace that couldn't hold something anyways. That minimum is currently 9 bytes for an internal reference, which is something like:

    $db->{foo} = {}; $db->{bar} = $db->{foo};
    Without expensive relocations, there's no way to avoid this in any application that's based on a diskfile. Even applications like Oracle, MySQL, and Postgres eventually leak filespace. You just don't notice it.
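
The difference between the append-only behaviour of the 0.x series and the freespace reuse planned for 1.00 can be illustrated with a toy model. This is not DBM::Deep's on-disk format or its actual algorithm: the "file" is just an end offset plus a list of freed chunks, the function names are hypothetical, and the 350-byte record size is borrowed from the growth reported in the question.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Toy model: a "file" is { end => offset of end of file, free => [chunks] }.

# Append-only allocation: new data always goes at the end of the file.
sub alloc_append {
    my ($file, $size) = @_;
    my $offset = $file->{end};
    $file->{end} += $size;
    return $offset;
}

# Freespace-aware allocation: use the first freed chunk that is large
# enough (first fit), and fall back to appending when none fits.
sub alloc_reuse {
    my ($file, $size) = @_;
    for my $i (0 .. $#{ $file->{free} }) {
        my $chunk = $file->{free}[$i];
        next if $chunk->{size} < $size;
        my $offset = $chunk->{offset};
        $chunk->{offset} += $size;
        $chunk->{size}   -= $size;
        # Drop the chunk once it is used up; we return right after.
        splice @{ $file->{free} }, $i, 1 if $chunk->{size} == 0;
        return $offset;
    }
    return alloc_append($file, $size);
}

# Mark a region as no longer in use.
sub release {
    my ($file, $offset, $size) = @_;
    push @{ $file->{free} }, { offset => $offset, size => $size };
}

# Replace one record of $record_size bytes on each of $runs runs and
# return the resulting file size.
sub simulate {
    my ($alloc, $runs, $record_size) = @_;
    my $file = { end => 0, free => [] };
    my $live;
    for (1 .. $runs) {
        release($file, $live, $record_size) if defined $live;
        $live = $alloc->($file, $record_size);
    }
    return $file->{end};
}

my $append_size = simulate(\&alloc_append, 10, 350);
my $reuse_size  = simulate(\&alloc_reuse,  10, 350);
print "append-only: $append_size bytes\n";    # prints 3500
print "with reuse:  $reuse_size bytes\n";     # prints 350
```

In this model the freed chunk always matches the next request exactly, so nothing is lost. With mixed record sizes, first fit leaves small leftover chunks that no later request can use, which is the kind of residual leak described above.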

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
Re: DBM::Deep dbfile size
by zentara (Cardinal) on Mar 08, 2006 at 21:49 UTC
    I'm not sure what the reasons are, but I tried an assortment of ways to clear the hash (= (), = undef, calling undef on it), and
    $db->{myhash} = '';
    seems to work fine.
