in reply to Re^4: threads::shared seems to kill performance
in thread threads::shared seems to kill performance
Okay. I found an obscure detail(Section entitled "In-memory Databases And Shared Cache" ) in the sqlite documentation that allows me to work around DBI's brokeness.
This demonstrates it:
#! perl -slw use strict; use threads; use threads::shared; use threads::Q; use Time::HiRes qw[ time ]; use DBI; use constant { CONNECT=> 'dbi:SQLite:dbname=file:memdb?mode=memory&cache=shared', CREATE => 'create table if not exists DB ( ID integer(8),' . join(',', map "F$_ text(15)", 1..9) . ')', INSERT => 'insert into DB ( ID , ' . join( ',', map "F$_", 1..9 ) . ') values (' . '?,' x 9 . '?) +', INDEX => 'create index if not exists I1 on DB ( ID )', QUERY => 'select * from DB where ID = ?', }; sub thread { my $tid = threads->tid; my( $Q ) = @_; $Q->dq; ## Wait for DB my $dbh = DBI->connect( CONNECT, '', '' ) or die DBI::errstr; my $sth = $dbh->prepare( QUERY ) or die DBI->errstr; while( my $id = $Q->dq ) { $sth->execute( $id ) or die DBI::errstr; my $r = $sth->fetch or warn( "No data for $id" ) and next; ## do something with record. printf "[$tid] %5u %s %s %s %s %s %s %s %s %s\n", @{ $r }; } $sth->finish; $dbh->disconnect; } my @chars = ( 'a'..'z' ); sub dummy { my $n = shift; join '', @chars[ map int( rand @chars ), 1 .. $n ]; } our $T //= 4; our $N //= 100; my $dbh = DBI->connect( CONNECT, '', '', { AutoCommit =>0 } ) or die D +BI::errstr; $dbh->do( 'PRAGMA synchronous = off' ); $dbh->do( 'PRAGMA cache_size = 800000' ); $dbh->do( CREATE ) or die DBI::errstr; my $ins = $dbh->prepare( INSERT ) or die DBI->errstr; for my $n ( 1 .. $N ) { my @fields = ( $n, map dummy( 15 ), 1 .. 9 ); $ins->execute( @fields )or die $ins->errstr; $n %100 or $dbh->commit } $ins->finish; $dbh->commit; #$dbh->do( INDEX ) or die DBI::errstr; my $sth = $dbh->prepare( QUERY ) or die DBI->errstr; for my $id ( 1 .. 5 ) { $sth->execute( $id ) or die DBI::errstr; my $r = $sth->fetch() or warn( "No data for $id" ) and next; ## do something with record. printf "[main] %5u %s %s %s %s %s %s %s %s %s\n", @{ $r }; } $sth->finish; print $dbh->selectrow_array( 'SELECT count(*) from DB' ); my $Q = threads::Q->new( 10 ); my @threads = map threads->create( \&thread, $Q, shared_clone( $dbh ) +), 1 .. $T; $Q->nq( ('Wakey wakey') x $T ); $Q->nq( $_ ) for 1 .. 20; ## $N; $Q->nq( (undef) x $T ); $_->join for @threads; $dbh->disconnect; unlink 'file'; ## without this, the memory db will persist in a appare +ntly empty file???? ## That might be turned into an advantage????
There is however some weirdness associated with this. (See the comments on the last two lines above).
Basically, the trick to making multiple DBI handles (in different threads) refer to the same in memory database is using a dbname of the form:
file:memdb?mode=memory&cache=shared
The weirdness is that although this is a "memory db", a file (in this example called 'file' in the local directory) is created. It will always be 0 bytes and running wc on it confirms it has no content.
But if you delete that empty file, the records from the first run disappear!
Not sure what to do about that other than just delete the file.
|
|---|