Update: Added SQLite_File for comparison.

Hello talexb,

The following is a demonstration comparing a plain hash against CDB_File, DB_File. and SQLite_File.

use strict; use warnings; use feature 'state'; use CDB_File; use DB_File; use SQLite_File; use Time::HiRes 'time'; my @points = ( [ 0, 0 ], [ -1, -2 ], [ 1, 2 ], [ -1, 2 ], ... ); sub plain_hash { my %hash; $hash{ join(':',@{$_}) } = 1 for @points; \%hash; } sub cdb_hash { my %hash; # create CDB file my $cdb = CDB_File->new("t.cdb", "t.cdb.$$") or die "$!\n"; $cdb->insert( join(':',@{$_}), 1 ) for @points; $cdb->finish; # use CDB file tie %hash, 'CDB_File', "t.cdb" or die "$!\n"; \%hash; } sub db_hash { my %hash; # create DB file tie %hash, 'DB_File', "t.db", O_CREAT|O_RDWR, $DB_BTREE; $hash{ join(':',@{$_}) } = 1 for @points; untie %hash; # use DB file tie %hash, 'DB_File', "t.db", O_RDWR, $DB_BTREE; \%hash; } sub sql_hash { tie my %hash, 'SQLite_File', "sql.db"; $hash{ join(':',@{$_}) } = 1 for @points; \%hash; } sub look { my $cells = shift; state $points_str = [ map { join(':',@{$_}) } @points ]; for my $p (@{ $points_str }) { exists $cells->{$p} or die; } } sub bench { my ( $desc, $func ) = @_; my ( $cells, $iters ) = ( $func->(), 50000 ); my $start = time; look($cells) for 1..$iters; my $elapse = time - $start; printf "%s duration : %0.03f secs\n", $desc, $elapse; printf "%s lookups : %d / sec\n", $desc, @points * $iters / $elapse; } bench( "plain", \&plain_hash ); bench( " cdb", \&cdb_hash ); bench( " db", \&db_hash ); bench( " sql", \&sql_hash );

Results: The native Perl on Mac OS X 10.11.6 is v5.18.2. The hardware is a Haswell i7 chip at 2.6 GHz.

Regarding CDB_File and DB_File performance, these run better with Perl 5.20 and later releases.

$ perl test.pl plain duration : 0.639 secs plain lookups : 10243321 / sec cdb duration : 5.692 secs cdb lookups : 1150711 / sec db duration : 9.746 secs db lookups : 672046 / sec $ /opt/perl-5.26.0/bin/perl test.pl plain duration : 0.517 secs plain lookups : 12677776 / sec cdb duration : 3.881 secs cdb lookups : 1687545 / sec db duration : 6.134 secs db lookups : 1067780 / sec sql duration : 184.338 secs sql lookups : 35532 / sec

Regards, Mario


In reply to Re^2: Fastest way to lookup a point in a set by marioroy
in thread Fastest way to lookup a point in a set by eyepopslikeamosquito

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.