in reply to Re^3: Moving from SQL to flat-files
in thread Moving from SQL to flat-files

Just tried out DBM::Deep. Easy to install, create a database, and start working. But there are a few issues which might be because I don't know enough.

Issue 1: Size -- I converted a two-column, 230k-row table into both a DBM::Deep database and a DB_File database. The 14 MB SQLite file grew to 25 MB as a DBM::Deep file, but shrank to 5 MB as a DB_File file.

Issue 2: Speed -- A simplistic benchmark that counts the number of records in the table gave the following:

Benchmark: timing 1 iterations of DB_File, DBM::Deep, SQLite 3...
   DB_File:  2 wallclock secs ( 1.90 usr +  0.15 sys =  2.05 CPU) @  0.49/s
 DBM::Deep: 93 wallclock secs (79.24 usr +  9.42 sys = 88.67 CPU) @  0.01/s
  SQLite 3:  0 wallclock secs ( 0.04 usr +  0.01 sys =  0.05 CPU) @ 19.61/s

My code was a simple "SELECT COUNT(*) FROM sqlitedb" for the SQLite db, and "return scalar keys(%$db)" for the other two databases. Is this expected? In particular, is the slowness of DBM::Deep expected, or is there a better way to do this query?

--

when small people start casting long shadows, it is time to go to bed

Re^5: Moving from SQL to flat-files
by dragonchild (Archbishop) on May 09, 2006 at 20:24 UTC
    DBM::Deep is pure Perl. It also encodes the Perl data structure in the file. Neither DB_File nor SQLite does that. So it's going to be considerably slower and somewhat bigger.

    Plus, keys() in DBM::Deep is very unoptimized. Most of the code is, right now. That's one reason I took it over.
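
One possible workaround for the slow count, sketched here under assumptions rather than taken from DBM::Deep itself: keep a running counter under a reserved key and update it on every insert and delete, so a count never has to walk the keys. The key name `__count__` and the helpers `put_record`, `delete_record` and `record_count` are hypothetical, and a plain hash reference stands in for the tied database.

```perl
use strict;
use warnings;

# Hypothetical key reserved for the running record count.
my $COUNT_KEY = '__count__';

# Store a record; bump the counter only when the key is new.
sub put_record {
    my ($db, $key, $value) = @_;
    $db->{$COUNT_KEY} = ($db->{$COUNT_KEY} || 0) + 1
        unless exists $db->{$key};
    $db->{$key} = $value;
    return;
}

# Delete a record; decrement the counter only if the key existed.
sub delete_record {
    my ($db, $key) = @_;
    return unless exists $db->{$key};
    delete $db->{$key};
    $db->{$COUNT_KEY}--;
    return;
}

# Constant-time stand-in for scalar keys(%$db).
sub record_count {
    my ($db) = @_;
    return $db->{$COUNT_KEY} || 0;
}

# Assumption: a plain hash ref stands in for the tied database here.
my $db = {};
put_record($db, "id$_", "value $_") for 1 .. 5;
put_record($db, 'id3', 'updated');    # existing key, count unchanged
delete_record($db, 'id1');
print record_count($db), "\n";        # prints 4
```

The cost is that every write has to go through these helpers, and the reserved key must never collide with a real record key.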


    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
      Actually, I don't mind bigger that much. I just wanted to confirm whether I was doing something wrong, and whether there was any other way of getting a count of all the records.

      Once I know all the "gotchas" I can make a decision.

      Many thanks,

      --

      when small people start casting long shadows, it is time to go to bed
        I don't think I was as clear as I needed to be. Try the following code with a hash tied against the various DBMs.
        use Data::Dumper;
        # %x is the tied hash.
        $x{foo} = [ 1 .. 3, { a => { b => 'c' } } ];
        print Dumper \%x;
        Using DBM::Deep, the following will be printed:
        $VAR1 = {
          'foo' => bless( [
            '1',
            '2',
            '3',
            bless( {
              'a' => bless( {
                'b' => 'c'
              }, 'DBM::Deep::Hash' )
            }, 'DBM::Deep::Hash' )
          ], 'DBM::Deep::Array' )
        };
        Ignoring the random blessings, the entire data structure as Perl would see it is encoded. Using BerkeleyDB, you see:
        $VAR1 = { 'foo' => 'ARRAY(0x18014f4)' };

        Note: this is as designed for BDB - it isn't meant to handle anything as the value other than a simple binary string. DBM::Deep is.
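
To illustrate what "a simple binary string" means in practice, here is a minimal sketch of the usual workaround for string-only stores: serialise the structure before storing it and deserialise it after reading. It uses the core Storable module. The plain hash `%x` and the key `naive` are stand-ins for illustration; this is not how DBM::Deep works internally.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Assumption: a plain hash stands in for a hash tied to a string-only DBM.
my %x;

my $data = [ 1 .. 3, { a => { b => 'c' } } ];

# Storing the reference as a string keeps only its address,
# e.g. 'ARRAY(0x...)'; the contents are lost.
$x{naive} = "$data";

# Serialising first turns the whole structure into one binary string.
$x{foo} = freeze($data);

# Reading it back requires an explicit thaw.
my $copy = thaw($x{foo});
print $copy->[3]{a}{b}, "\n";    # prints c
```

The trade-off is that changing a single nested element means thawing, modifying, and re-freezing the whole value, whereas DBM::Deep lets you assign to the nested element directly.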


        My criteria for good software:
        1. Does it work?
        2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?