Hi.

Well as I stated earlier I had serious reservations about the validity of your benchmark. When I messaged you about it you replied "The point was to use all the modules' simplest interfaces". This is a specious argument. There is no difference between the interface for a DB_HASH or a DB_BTREE. The difference is underneath at the implementation level. And the difference is huge.

I reran your benchmark without CDB_File, it wont build cleanly on W32, and produced the following results which incidentally do not completely support my earlier coments ( the file size differences are not what I said they would be.) However they do show that your comparison of DB_HASH against DBD::SQLite is misleading (SQLite gets _creamed_ by a $DB_TREE), and they also show that if you are looking for high speed with a BerkeleyDB then you should definately go with a $DB_BTREE instead of a $DB_HASH.

My results were as follows:

Benchmark: timing 30000 iterations of 
	I B+Tree, I B+Tree (A), I Hash, I Hash (A), I SQLite, I SQLite 2...
    I B+Tree:  1 wallclock secs ( 0.88 usr +  0.11 sys =  0.98 CPU) @ 30487.80/s (n=30000)
I B+Tree (A):  1 wallclock secs ( 0.95 usr +  0.02 sys =  0.97 CPU) @ 30959.75/s (n=30000)
      I Hash:  8 wallclock secs ( 2.25 usr +  4.38 sys =  6.63 CPU) @ 4528.30/s  (n=30000)
  I Hash (A):  7 wallclock secs ( 2.31 usr +  4.58 sys =  6.89 CPU) @ 4353.50/s  (n=30000)
    I SQLite: 76 wallclock secs (14.84 usr + 49.66 sys = 64.50 CPU) @ 465.12/s   (n=30000)
  I SQLite 2:  4 wallclock secs ( 3.30 usr +  0.09 sys =  3.39 CPU) @ 8849.56/s  (n=30000)
  
                Rate I SQLite I Hash (A) I Hash I SQLite 2 I B+Tree I B+Tree (A)
I SQLite       465/s       --       -89%   -90%       -95%     -98%         -98%
I Hash (A)    4354/s     836%         --    -4%       -51%     -86%         -86%
I Hash        4528/s     874%         4%     --       -49%     -85%         -85%
I SQLite 2    8850/s    1803%       103%    95%         --     -71%         -71%
I B+Tree     30488/s    6455%       600%   573%       245%       --          -2%
I B+Tree (A) 30960/s    6556%       611%   584%       250%       2%           --



Benchmark: timing 30000 iterations of 
	S B+Tree, S B+Tree(A), S Hash, S Hash(A), S SQLite, S SQLite 2...
   S B+Tree:  1 wallclock secs ( 0.77 usr +  0.09 sys =  0.86 CPU) @ 34965.03/s (n=30000)
S B+Tree(A):  1 wallclock secs ( 0.95 usr +  0.05 sys =  1.00 CPU) @ 30000.00/s (n=30000)
     S Hash:  2 wallclock secs ( 1.03 usr +  1.63 sys =  2.66 CPU) @ 11295.18/s (n=30000)
  S Hash(A):  3 wallclock secs ( 1.47 usr +  1.36 sys =  2.83 CPU) @ 10608.20/s (n=30000)
   S SQLite: 11 wallclock secs ( 6.80 usr +  3.69 sys = 10.49 CPU) @ 2861.23/s  (n=30000)
 S SQLite 2:  4 wallclock secs ( 3.89 usr +  0.09 sys =  3.98 CPU) @ 7530.12/s  (n=30000)
 
               Rate S SQLite S SQLite 2 S Hash(A)    S Hash S B+Tree(A) S B+Tree
S SQLite     2861/s       --       -62%      -73%      -75%        -90%     -92%
S SQLite 2   7530/s     163%         --      -29%      -33%        -75%     -78%
S Hash(A)   10608/s     271%        41%        --       -6%        -65%     -70%
S Hash      11295/s     295%        50%        6%        --        -62%     -68%
S B+Tree(A) 30000/s     948%       298%      183%      166%          --     -14%
S B+Tree    34965/s    1122%       364%      230%      210%         17%       --

Resulting File Sizes:
bdb.hash.test :5226496
bdb.btree.test :6701056
bdb.hash.a.test :5095424
bdb.btree.a.test :3579904
sqlite.test :3845120
sqlite2.test :2043904
And the code I used to produce this benchmark is
use DB_File; use DBI; use Benchmark qw(cmpthese); use strict; use warnings; # do some setup my $id_BDB_hash = 0; my $id_BDB_hash_a = 0; my $id_BDB_btree_a = 0; my $id_BDB_btree = 0; my $id_lit2 = 0; my $id_lit = 0; my $txt = 'a' x 100; unlink( "bdb.hash.test", "bdb.btree.test", "bdb.hash.a.test", "bdb.btree.a.test", "sqlite.test", "sqlite2.test" ); my $dbh = DBI->connect( 'dbi:SQLite:dbname=sqlite.test', '', '', { AutoCommi +t => 1, RaiseError => 1 } ); my $dbh2 = DBI->connect( 'dbi:SQLite:dbname=sqlite2.test', '', '', { AutoComm +it => 0, RaiseError => 1 } ); $dbh->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100)) +'); $dbh2->do('CREATE TABLE test (id integer PRIMARY KEY, foo varchar(100) +)'); $dbh->do('PRAGMA default_synchronous = off'); my $sth_ins = $dbh->prepare('INSERT INTO test (id, foo) values (?, ?) +'); my $sth_sel = $dbh->prepare('SELECT foo FROM test where id=?'); my $sth2_ins = $dbh2->prepare('INSERT INTO test (id, foo) values (?, ? +)'); my $sth2_sel = $dbh2->prepare('SELECT foo FROM test where id=?'); tie my %bdb_btree_a, "DB_File", "bdb.btree.a.test", O_RDWR | O_CREAT, +0640, $DB_BTREE or die "Cannot open file 'bdb.btree.a.test': $!\n"; tied(%bdb_btree_a)->filter_fetch_key( sub { $_ = unpack( "N", $_ || 0 +) } ); tied(%bdb_btree_a)->filter_store_key( sub { $_ = pack( "N", $_ || 0 ) +} ); tie my %bdb_hash_a, "DB_File", "bdb.hash.a.test", O_RDWR | O_CREAT, 06 +40, $DB_HASH or die "Cannot open file 'bdb.hash.a.test': $!\n"; tied(%bdb_hash_a)->filter_fetch_key( sub { $_ = unpack( "N", $_ || 0 ) + } ); tied(%bdb_hash_a)->filter_store_key( sub { $_ = pack( "N", $_ || 0 ) } + ); tie my %bdb_hash, "DB_File", "bdb.hash.test", O_RDWR | O_CREAT, 0640, +$DB_HASH or die "Cannot open file 'bdb.hash.test': $!\n"; tie my %bdb_btree, "DB_File", "bdb.btree.test", O_RDWR | O_CREAT, 0640 +, $DB_BTREE or die "Cannot open file 'bdb.btree.test': $!\n"; sub insertLite { $sth_ins->execute( $id_lit++, $txt ); } sub insertLite2 { $sth2_ins->execute( $id_lit2++, $txt ); } sub insertBDB_btree_a { $bdb_btree_a{ $id_BDB_btree_a++ } = $txt; } sub insertBDB_hash_a { $bdb_hash_a{ $id_BDB_hash_a++ } = $txt; } sub insertBDB_btree { $bdb_btree{ $id_BDB_btree++ } = $txt; } sub insertBDB_hash { $bdb_hash{ $id_BDB_hash++ } = $txt; } sub selectLite { $sth_sel->execute( $id_lit++ ); 1 while $sth_sel->fetch; } sub selectLite2 { $sth2_sel->execute( $id_lit2++ ); 1 while $sth2_sel->fetch; } sub selectBDB_btree_a { $bdb_btree_a{ $id_BDB_btree_a++ }; } sub selectBDB_hash_a { $bdb_hash_a{ $id_BDB_hash_a++ }; } sub selectBDB_btree { $bdb_btree{ $id_BDB_btree++ }; } sub selectBDB_hash { $bdb_hash{ $id_BDB_hash++ }; } cmpthese( 30_000, { 'I SQLite' => \&insertLite, 'I SQLite 2' => \&insertLite2, 'I Hash' => \&insertBDB_hash, 'I B+Tree' => \&insertBDB_btree, 'I Hash (A)' => \&insertBDB_hash_a, 'I B+Tree (A)' => \&insertBDB_btree_a, }, ); $id_BDB_hash = 0; $id_BDB_hash_a = 0; $id_BDB_btree_a = 0; $id_BDB_btree = 0; $id_lit2 = 0; $id_lit = 0; cmpthese( 30_000, { 'S SQLite' => \&selectLite, 'S SQLite 2' => \&selectLite2, 'S Hash' => \&selectBDB_hash, 'S B+Tree' => \&selectBDB_btree, 'S Hash(A)' => \&selectBDB_hash_a, 'S B+Tree(A)' => \&selectBDB_btree_a, }, ); foreach my $file ( "bdb.hash.test", "bdb.btree.test", "bdb.hash.a.test", "bdb.btree.a +.test", "sqlite.test", "sqlite2.test" ) { print "$file :" . ( -s $file ) . "\n"; }
As Matts said, please no comments about the benchmarking code, I know its could be a lot nicer. (Its only a benchmark... :-)

Yves / DeMerphq
--
Benchmarking things isn't as easy or as straightforward as many people seem to think....


In reply to New Benchmark: SQLite vs CDB_File vs BerkeleyDB by demerphq
in thread SQLite vs CDB_File vs BerkeleyDB by Matts

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.