in reply to BerkeleyDB vs. Linux file system

perrin is using BerkeleyDB incorrectly, and it's showing up slower than it should: replace the ->STORE and ->FETCH methods with ->db_put and ->db_get. Or at least, if you're going to use the OO interface, use it correctly. This usage is halfway between the tied interface and the OO interface. To be more meaningful, I think the benchmark should have picked a style and stuck with it (OO of course, since that's how I use it *smirk*).
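For readers who have not used both styles, here is a minimal sketch of the two interfaces side by side. It assumes the CPAN BerkeleyDB module is installed; the file names, the temporary directory, and the key 'answer' are made up for illustration and are not taken from perrin's benchmark.

```perl
use strict;
use warnings;
use BerkeleyDB;                 # CPAN module, not core Perl
use File::Temp qw(tempdir);

my $dir = tempdir(CLEANUP => 1);    # hypothetical scratch directory

# OO interface: an explicit handle; db_put and db_get return 0 on success.
my $db = BerkeleyDB::Hash->new(
    -Filename => "$dir/oo.db",
    -Flags    => DB_CREATE,
) or die "cannot open database: $BerkeleyDB::Error";

$db->db_put('answer', 42) == 0 or die "db_put failed";
my $oo_value;
$db->db_get('answer', $oo_value) == 0 or die "db_get failed";

# Tied interface: an ordinary-looking hash backed by the database.
tie my %h, 'BerkeleyDB::Hash',
    -Filename => "$dir/tied.db",
    -Flags    => DB_CREATE
    or die "cannot tie database: $BerkeleyDB::Error";

$h{answer} = 42;
my $tied_value = $h{answer};

print "OO: $oo_value, tied: $tied_value\n";

# Release both handles before the temporary directory is removed.
undef $db;
untie %h;
```

Calling ->STORE or ->FETCH directly on the object, as the original benchmark does, goes through the methods that exist to support tie, which is the "halfway" usage described above.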

Some other people noted that they'd much prefer that you read from the FH handle using read() instead of readline().
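A minimal sketch of the two ways to pull a whole file into a scalar, using only core Perl. The sub names and the temporary file are illustrative; the benchmark's own read_file uses a bareword FH handle instead of a lexical one.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Slurp with readline by undefining the input record separator.
sub slurp_with_readline {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open $path: $!";
    local $/;                       # undef: <$fh> reads to end of file
    my $value = <$fh>;
    close $fh;
    return $value;
}

# Slurp with a single read() sized from the file's length.
sub slurp_with_read {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "open $path: $!";
    my $value = '';
    read($fh, $value, -s $fh);      # -s gives the size in bytes
    close $fh;
    return $value;
}

# Write an 8000-byte test file, roughly the size used in the benchmark.
my ($tmp_fh, $tmp_path) = tempfile(UNLINK => 1);
print {$tmp_fh} 'x' x 8000;
close $tmp_fh;

my $via_readline = slurp_with_readline($tmp_path);
my $via_read     = slurp_with_read($tmp_path);
print length($via_readline), " ", length($via_read), "\n";
```

Both return the same data; the difference being argued about is only how much work Perl does to get there.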

Re^2: BerkeleyDB vs. Linux file system
by diotalevi (Canon) on Mar 18, 2003 at 17:19 UTC

    Results using the slow functions

    Benchmark: timing 10 iterations of berkeley write, file write...
    berkeley write: 132 wallclock secs (32.46 usr + 20.78 sys = 53.24 CPU) @  0.19/s (n=10)
    file write: 61 wallclock secs (24.18 usr + 12.72 sys = 36.90 CPU) @  0.27/s (n=10)
                   s/iter berkeley write     file write
    berkeley write   5.32             --           -31%
    file write       3.69            44%             --

    Benchmark: timing 10 iterations of berkeley read, file read...
    berkeley read: 179 wallclock secs (11.48 usr +  7.47 sys = 18.95 CPU) @  0.53/s (n=10)
    file read: 225 wallclock secs ( 7.72 usr +  6.81 sys = 14.53 CPU) @  0.69/s (n=10)
                  s/iter berkeley read     file read
    berkeley read   1.89            --          -23%
    file read       1.45           30%            --
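When reading these tables, it helps to see where the s/iter and percentage figures come from. The sketch below recomputes the write table from the CPU totals printed above; the numbers are consistent with cmpthese working from CPU time (usr + sys) rather than wallclock time, which would explain why file read is ranked faster even though its wallclock figure is higher. The helper names are illustrative, not part of the Benchmark module.

```perl
use strict;
use warnings;

# CPU seconds (usr + sys) and iteration count, copied from the write results above.
my %cpu_secs   = ('berkeley write' => 53.24, 'file write' => 36.90);
my $iterations = 10;

# Seconds per iteration, as shown in the s/iter column.
sub secs_per_iter {
    my ($name) = @_;
    return $cpu_secs{$name} / $iterations;
}

# Percentage by which $row's rate differs from $col's rate.
sub percent_vs {
    my ($row, $col) = @_;
    my $row_rate = $iterations / $cpu_secs{$row};
    my $col_rate = $iterations / $cpu_secs{$col};
    return 100 * $row_rate / $col_rate - 100;
}

for my $row (sort keys %cpu_secs) {
    my ($col) = grep { $_ ne $row } keys %cpu_secs;
    printf "%-15s %.2f s/iter, %+.0f%% vs %s\n",
        $row, secs_per_iter($row), percent_vs($row, $col), $col;
}
```

So "-31%" means berkeley write completes about 31% fewer iterations per CPU second than file write, and "44%" is the same gap seen from the other side.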

    Results using the faster functions. This shows a very nice boost to file read and a modest boost to BerkeleyDB read and write.

    Benchmark: timing 10 iterations of berkeley write, file write...
    berkeley write: 96 wallclock secs (25.64 usr + 21.62 sys = 47.26 CPU) @  0.21/s (n=10)
    file write: 58 wallclock secs (23.88 usr + 13.41 sys = 37.29 CPU) @  0.27/s (n=10)
                   s/iter berkeley write     file write
    berkeley write   4.73             --           -21%
    file write       3.73            27%             --

    Benchmark: timing 10 iterations of berkeley read, file read...
    berkeley read: 163 wallclock secs (10.58 usr +  7.83 sys = 18.41 CPU) @  0.54/s (n=10)
    file read: 135 wallclock secs ( 8.21 usr +  6.12 sys = 14.33 CPU) @  0.70/s (n=10)
                  s/iter berkeley read     file read
    berkeley read   1.84            --          -22%
    file read       1.43           28%            --

    My alteration to perrin's benchmark

    --- perrin-bench.pl     Tue Mar 18 11:41:42 2003
    +++ perrin-bench2.pl    Tue Mar 18 12:00:06 2003
    @@ -28,9 +28,9 @@
     sub read_file {
         my $key = shift;
         my $file = "$file_dir/$key";
    +    my $value;
         open(FH, '<', $file) or die $!;
    -    local $/;
    -    my $value = <FH>;
    +    read FH, $value, (stat FH)[7];
         close FH;
         return $value;
     }
    @@ -52,20 +52,22 @@
         'berkeley write' => sub {
             for (0..1000) {
    -            $db_obj->STORE($_, $_ x 8000);
    +            $db_obj->db_put($_, $_ x 8000);
             }
         },
     });

     cmpthese(10, {
         'file read' => sub {
    +        my $test;
             for (0..1000) {
    -            read_file($_);
    +            $test = read_file($_);
             }
         },
         'berkeley read' => sub {
    +        my $test;
             for (0..1000) {
    -            $db_obj->FETCH($_);
    +            $db_obj->db_get($_,$test);
             }
         },
     });

      Interesting, your results are just the opposite of mine! Berkeley is slower at writing and faster at reading (before the switch to read) in yours. Must be OpenBSD.

      I'd like to see if sysread/syswrite make much of a difference too. I'll try that later on Linux.

        I rather suspect it's the difference in strategies for our file systems. OpenBSD uses ffs and I've enabled soft updates as well (normal and common). The filesystem is normally in sync mode and, IIRC, soft updates relax that somewhat for the data while continuing to be strict about metadata. Again, IIRC, Linux's ext2/3 operates in async mode by default, which is faster but less able to handle interruptions.

        $ perl perrin-bench3.pl
        Benchmark: timing 10 iterations of berkeley write, file print, file syswrite, file write...
        berkeley write: 96 wallclock secs (25.02 usr + 21.67 sys = 46.69 CPU) @  0.21/s (n=10)
        file print: 54 wallclock secs (23.83 usr + 12.98 sys = 36.81 CPU) @  0.27/s (n=10)
        file syswrite: 44 wallclock secs (23.17 usr + 11.72 sys = 34.89 CPU) @  0.29/s (n=10)
        file write: 54 wallclock secs (23.65 usr + 12.34 sys = 35.99 CPU) @  0.28/s (n=10)
                       s/iter berkeley write  file print  file write  file syswrite
        berkeley write   4.67             --        -21%        -23%           -25%
        file print       3.68            27%          --         -2%            -5%
        file write       3.60            30%          2%          --            -3%
        file syswrite    3.49            34%          6%          3%             --

        Benchmark: timing 10 iterations of berkeley read, file read, file slurp, file sysread...
        berkeley read: 170 wallclock secs (10.95 usr +  7.75 sys = 18.70 CPU) @  0.53/s (n=10)
        file read: 156 wallclock secs ( 7.68 usr +  6.01 sys = 13.69 CPU) @  0.73/s (n=10)
        file slurp: 173 wallclock secs (11.71 usr +  6.62 sys = 18.33 CPU) @  0.55/s (n=10)
        file sysread: 194 wallclock secs ( 6.80 usr +  5.45 sys = 12.25 CPU) @  0.82/s (n=10)
                      s/iter berkeley read  file slurp  file read  file sysread
        berkeley read   1.87            --         -2%       -27%          -34%
        file slurp      1.83            2%          --       -25%          -33%
        file read       1.37           37%         34%         --          -11%
        file sysread    1.23           53%         50%        12%            --

        --- perrin-bench2.pl    Tue Mar 18 12:00:06 2003
        +++ perrin-bench3.pl    Tue Mar 18 14:00:21 2003
        @@ -35,6 +35,34 @@
             return $value;
         }

        +sub slurp_file {
        +    my $key = shift;
        +    my $file = "$file_dir/$key";
        +    local $/;
        +    open(FH, '<', $file) or die $!;
        +    my $value = <FH>;
        +    close FH;
        +    return $value;
        +}
        +
        +sub sysread_file {
        +    my $key = shift;
        +    my $file = "$file_dir/$key";
        +    my $value;
        +    open(FH, '<', $file) or die $!;
        +    sysread FH, $value, (stat FH)[7];
        +    close FH;
        +    return $value;
        +}
        +
        +sub print_file {
        +    my ($key, $value) = @_;
        +    my $file = "$file_dir/$key";
        +    open(FH, '>', $file) or die $!;
        +    print FH $value;
        +    close FH;
        +}
        +
         sub write_file {
             my ($key, $value) = @_;
             my $file = "$file_dir/$key";
        @@ -43,13 +71,30 @@
             close FH;
         }

        +sub syswrite_file {
        +    my ($key, $value) = @_;
        +    my $file = "$file_dir/$key";
        +    open(FH, '>', $file) or die $!;
        +    print FH $value;
        +    close FH;
        +}
        +
         cmpthese(10, {
             'file write' => sub {
                 for (0..1000) {
                     write_file($_, $_ x 8000);
                 }
             },
        -
        +    'file print' => sub {
        +        for (0..1000) {
        +            print_file($_, $_ x 8000);
        +        }
        +    },
        +    'file syswrite' => sub {
        +        for (0..1000) {
        +            syswrite_file($_, $_ x 8000);
        +        }
        +    },
             'berkeley write' => sub {
                 for (0..1000) {
                     $db_obj->db_put($_, $_ x 8000);
        @@ -62,6 +107,18 @@
                 my $test;
                 for (0..1000) {
                     $test = read_file($_);
        +        }
        +    },
        +    'file slurp' => sub {
        +        my $test;
        +        for (0..1000) {
        +            $test = slurp_file($_);
        +        }
        +    },
        +    'file sysread' => sub {
        +        my $test;
        +        for (0..1000) {
        +            $test = sysread_file($_);
                 }
             },
             'berkeley read' => sub {

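Note that syswrite_file as posted in the diff calls print, so the "file syswrite" row appears to measure a second buffered print rather than syswrite itself. A version that actually uses the unbuffered calls might look like the sketch below. It is not the code that produced the numbers above; $file_dir here is a temporary stand-in for the benchmark's directory, and the loops guard against short reads and writes, which sysread and syswrite are allowed to return.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $file_dir = tempdir(CLEANUP => 1);   # stand-in for the benchmark's $file_dir

# Write $value with syswrite, looping in case of a partial write.
sub syswrite_file {
    my ($key, $value) = @_;
    my $file = "$file_dir/$key";
    open(my $fh, '>', $file) or die "open $file: $!";
    my $offset = 0;
    my $length = length $value;
    while ($offset < $length) {
        my $written = syswrite($fh, $value, $length - $offset, $offset);
        die "syswrite $file: $!" unless defined $written;
        $offset += $written;
    }
    close $fh or die "close $file: $!";
}

# Read the whole file with sysread, looping in case of a short read.
sub sysread_file {
    my ($key) = @_;
    my $file = "$file_dir/$key";
    open(my $fh, '<', $file) or die "open $file: $!";
    my $value = '';
    my $size  = -s $fh;
    while (length($value) < $size) {
        my $got = sysread($fh, $value, $size - length($value), length($value));
        die "sysread $file: $!" unless defined $got;
        last if $got == 0;              # end of file
    }
    close $fh;
    return $value;
}

syswrite_file(7, 7 x 8000);
my $round_trip = sysread_file(7);
print length($round_trip), "\n";
```

Rerunning the benchmark with a body like this would show whether syswrite really differs from print on a given system.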
Re: Re: BerkeleyDB vs. Linux file system
by perrin (Chancellor) on Mar 18, 2003 at 17:16 UTC
    Actually, I did try db_get and db_put. There was no significant difference. It does not affect the results.

      I see a difference between FETCH/STORE and db_put/db_get. All this confirms for me is that BerkeleyDB is fast enough. I'm just glad it competes nicely with the file system (which as you said has all sorts of in-kernel advantages). My own system is OpenBSD 3.2 using the GENERIC kernel on a 233 MMX pentium using ATA-100 discs in "PIO mode 4, Ultra-DMA mode 5" (whatever that means).

        I didn't say there was no difference, only that it wasn't significant. BerkeleyDB is definitely very fast, and would be a much better choice than a file system for anything small (~50 bytes).

        Did OpenBSD produce different results, or was it about the same, i.e. Berkeley writes faster and reads slower?