I'm certainly no expert here. My belief that DBIx::Class with SQLite is threadsafe is based not just on the CPAN documentation that you referenced but also on several threads that I have read from the DBIx::Class developers. The most complete one that I remember is this thread from GrokBase: [Dbix-class] "fork- and thread-safe"

Things certainly seem to behave properly when reading from SQLite; however, I can't say the same for writing to SQLite.

But as I previously mentioned, this appears to be an issue with how the lock state of the database is handled. I haven't tested this across separate autonomous processes, though while typing this note I thought of a way I can probably test this without too much work. I'll give it a try this evening when I have a little time.

Re^5: DBIx::Class and Parallel SQLite Transactions
by Marshall (Canon) on Jul 14, 2011 at 16:20 UTC
    But as I previously mentioned, this appears to be an issue with how the lock state of the database is handled.

    The write performance of SQLite will be far lower than that of, say, MySQL. SQLite needs to gain an exclusive lock on the entire DB file in order to do a write. There is no table-level or row-level locking. So I figure that having multiple writers will bring you nothing but trouble. Since the rows in each of your transactions are exclusive (they don't overlap with other transactions), a DB that can do row-level locking could yield much higher performance.
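To make the "whole file, not table or row" point concrete, here is a minimal sketch. It uses Python's standard sqlite3 module rather than DBI purely so that it runs without extra modules; the table names t1 and t2 and the function name are made up for the illustration. It assumes the default rollback-journal mode and a busy timeout of zero so that contention shows up at once instead of being waited out.

```python
import os
import sqlite3
import tempfile


def second_writer_blocked():
    """Return (error seen by the second writer, rows finally in t2)."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # isolation_level=None: the module never issues BEGIN/COMMIT for us.
    # timeout=0: report a busy database immediately instead of waiting.
    a = sqlite3.connect(path, timeout=0, isolation_level=None)
    b = sqlite3.connect(path, timeout=0, isolation_level=None)
    a.execute("CREATE TABLE t1 (x INTEGER)")
    a.execute("CREATE TABLE t2 (x INTEGER)")

    a.execute("BEGIN")
    a.execute("INSERT INTO t1 VALUES (1)")      # a now holds the write lock
    try:
        b.execute("INSERT INTO t2 VALUES (2)")  # different table, same file
        error = None
    except sqlite3.OperationalError as exc:
        error = str(exc)
    a.execute("COMMIT")

    b.execute("INSERT INTO t2 VALUES (2)")      # succeeds once a is done
    rows_in_t2 = b.execute("SELECT COUNT(*) FROM t2").fetchone()[0]
    a.close()
    b.close()
    return error, rows_in_t2
```

Calling second_writer_blocked() should show the second connection being refused even though it touches a different table. With a non-zero busy timeout it would wait and retry instead of failing straight away.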

    I am working on a SQLite project and have found the O'Reilly book, "Using SQLite" by Jay A. Kreibich to be helpful. It is mainly oriented around the C interface, but there is plenty of great info for the Perl users too. There are discussions about how and why the DB can be busy - it is locking related.

    Update: Oh, I'm not sure how well the DBI does with threads. The implementation may be "safe", but not high performance, i.e. it may just wind up serializing things. With MySQL, you may get higher performance with a process per writer instead of a thread. Benchmarks will tell.

      I'm well into Kreibich's book already. While it is a little dated and focused on C, there IS a lot of good stuff in it when it comes to how SQLite works.

      Insofar as the locking is concerned, if I read Kreibich correctly, the exclusive lock is not requested until the commit is initiated. So other processes can write to the SQLite cache at the same time, but only one can commit at a time. Am I reading this wrong? Sounds like my reading topic for the bus ride home this afternoon :)
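One way to check this reading is to watch when a second connection starts being refused. The sketch below again uses Python's sqlite3 module instead of DBI, with a hypothetical table t and a zero busy timeout. In the default rollback-journal mode I would expect the plain BEGIN to take no lock at all, and the first INSERT inside the transaction (not the COMMIT) to be the point at which other writers are shut out.

```python
import os
import sqlite3
import tempfile


def lock_timeline():
    """Record whether a second connection can write at each stage."""
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    first = sqlite3.connect(path, timeout=0, isolation_level=None)
    second = sqlite3.connect(path, timeout=0, isolation_level=None)
    first.execute("CREATE TABLE t (who TEXT)")

    def second_can_write(label):
        try:
            second.execute("INSERT INTO t VALUES (?)", (label,))
            return True
        except sqlite3.OperationalError:
            return False

    timeline = {}
    first.execute("BEGIN")                           # deferred: no lock yet
    timeline["after BEGIN"] = second_can_write("second-1")
    first.execute("INSERT INTO t VALUES ('first')")  # write lock taken here
    timeline["after first INSERT"] = second_can_write("second-2")
    first.execute("COMMIT")
    timeline["after COMMIT"] = second_can_write("second-3")

    timeline["rows"] = first.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    first.close()
    second.close()
    return timeline
```

If the second connection is refused at the "after first INSERT" stage, then only one connection at a time can have pending writes, and the exclusive lock at commit time is a separate, later step that shuts out readers as well.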

      Insofar as whether there is an advantage for multiple threads is concerned, my application does a lot of inserts the first time through and afterwards updates most if not all of the same records. So my thought is that, since there is a fair amount of reading to identify whether to update or insert, there would be advantages to some degree of parallelization, maybe two threads versus just one.

      This spurs my thought of having a two-thread write procedure. The first thread would identify which records are updates and which are inserts. These would be queued for the second thread so that it is writing continuously. I'll have to give this a test also.
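A rough sketch of that two-thread layout, written in Python with the standard sqlite3, threading and queue modules rather than Perl threads. The items table, the function names and the single big write transaction are all assumptions made for the illustration. Note that the classifier only sees committed data, so this version assumes each key appears at most once per batch.

```python
import os
import queue
import sqlite3
import tempfile
import threading

STOP = object()  # sentinel telling the writer that no more work is coming


def classify(path, records, work):
    """Thread 1: look each key up and queue it as an insert or an update."""
    conn = sqlite3.connect(path)
    try:
        for key, value in records:
            found = conn.execute(
                "SELECT 1 FROM items WHERE id = ?", (key,)).fetchone()
            work.put(("update" if found else "insert", key, value))
    finally:
        conn.close()      # release read locks before the writer commits
        work.put(STOP)


def write(path, work):
    """Thread 2: apply the queued work inside one write transaction."""
    conn = sqlite3.connect(path, isolation_level=None)
    conn.execute("BEGIN IMMEDIATE")
    while True:
        item = work.get()
        if item is STOP:
            break
        action, key, value = item
        if action == "insert":
            conn.execute("INSERT INTO items (id, value) VALUES (?, ?)",
                         (key, value))
        else:
            conn.execute("UPDATE items SET value = ? WHERE id = ?",
                         (value, key))
    conn.execute("COMMIT")
    conn.close()


def run_pipeline(path, records):
    work = queue.Queue()
    threads = [threading.Thread(target=classify, args=(path, records, work)),
               threading.Thread(target=write, args=(path, work))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


def demo():
    path = os.path.join(tempfile.mkdtemp(), "items.db")
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
    setup.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b")])
    setup.commit()
    setup.close()
    run_pipeline(path, [(1, "a2"), (3, "c"), (2, "b2"), (4, "d")])
    check = sqlite3.connect(path)
    rows = check.execute("SELECT id, value FROM items ORDER BY id").fetchall()
    check.close()
    return rows
```

Each thread opens its own connection, and the writer commits only after the classifier has closed its connection, so the commit never has to wait on a reader. Whether this beats a single thread is something only a benchmark on the real data can tell.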

      Thanks!

      lbe

        I am also in a "learn mode" about the locking details. One important thing: When using the BEGIN DEFERRED transaction (the default), deadlocks are possible. A deadlock is not possible when using BEGIN IMMEDIATE transaction. On Page 154, Chapter 7, paragraph 3:
        "A BEGIN IMMEDIATE transaction can be started while other connections are reading from the database. Once started, no new writers will be allowed, but read-only connections can continue to access the database up until the point that the immediate transaction is forced to modify the database file. This is normally when the transaction is committed."
        There is some more explanation on Page 155, "When Busy becomes Blocked". So a BEGIN IMMEDIATE transaction means: I am saying that this transaction is going to do a write and I want the DB to go into read_only mode. If I don't get a "busy", that's what happens (DB is now read_only until I finish my transaction). My changes are held in the memory cache until I say COMMIT (a cache write is not a "real" write to the disk). When I say COMMIT, first, the database will not allow any new read transactions to start. Then second, the DB will wait for all other transactions to finish (they are all read transactions). Once that happens, my writes can occur because I can have exclusive access to the DB.
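The deadlock case can be reproduced with two connections. This is a sketch in Python's sqlite3 module (not DBI), with a hypothetical table t, a zero busy timeout and the default rollback-journal mode assumed. With deferred transactions both connections read first and then both try to write; with BEGIN IMMEDIATE the second connection is turned away before it holds anything.

```python
import os
import sqlite3
import tempfile


def _open_pair():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    # timeout=0 surfaces "busy" at once; isolation_level=None leaves
    # BEGIN/COMMIT entirely under our control.
    a = sqlite3.connect(path, timeout=0, isolation_level=None)
    b = sqlite3.connect(path, timeout=0, isolation_level=None)
    a.execute("CREATE TABLE t (x INTEGER)")
    return a, b


def _try(conn, sql):
    """Run sql; return None on success or the error text on failure."""
    try:
        conn.execute(sql).fetchall()
        return None
    except sqlite3.OperationalError as exc:
        return str(exc)


def deferred_standoff():
    a, b = _open_pair()
    for conn in (a, b):
        conn.execute("BEGIN")                              # deferred
        conn.execute("SELECT COUNT(*) FROM t").fetchall()  # both now reading
    steps = [
        _try(a, "INSERT INTO t VALUES (1)"),  # a becomes the pending writer
        _try(b, "INSERT INTO t VALUES (2)"),  # b is refused
        _try(a, "COMMIT"),                    # a is refused: b still reads
    ]
    b.execute("ROLLBACK")                     # one side has to give way
    steps.append(_try(a, "COMMIT"))           # now a can finish
    a.close()
    b.close()
    return steps


def immediate_no_standoff():
    a, b = _open_pair()
    steps = [
        _try(a, "BEGIN IMMEDIATE"),
        _try(b, "BEGIN IMMEDIATE"),           # refused up front
        _try(a, "INSERT INTO t VALUES (1)"),
        _try(a, "COMMIT"),                    # nothing stands in a's way
    ]
    a.close()
    b.close()
    return steps
```

In the deferred run neither side can make progress until one of them rolls back, and no amount of waiting would fix that. In the immediate run the loser finds out at BEGIN time and can simply retry later.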

        I don't understand what happens if there is a mix of IMMEDIATE and DEFERRED transactions that want to do writes.
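A small experiment can probe the mixed case. The sketch below (Python's sqlite3 module, hypothetical table t, zero busy timeout, default rollback-journal mode assumed) has one connection use BEGIN IMMEDIATE while the other uses a plain deferred BEGIN, and records what the deferred one is allowed to do.

```python
import os
import sqlite3
import tempfile


def mixed_modes():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    imm = sqlite3.connect(path, timeout=0, isolation_level=None)
    dfr = sqlite3.connect(path, timeout=0, isolation_level=None)
    imm.execute("CREATE TABLE t (who TEXT)")
    seen = {}

    imm.execute("BEGIN IMMEDIATE")
    imm.execute("INSERT INTO t VALUES ('immediate')")

    dfr.execute("BEGIN")                       # deferred BEGIN is accepted
    seen["deferred read"] = dfr.execute(
        "SELECT COUNT(*) FROM t").fetchone()[0]  # uncommitted row invisible
    try:
        dfr.execute("INSERT INTO t VALUES ('deferred')")
        seen["deferred write"] = "ok"
    except sqlite3.OperationalError as exc:
        seen["deferred write"] = str(exc)

    dfr.execute("ROLLBACK")                    # step aside, retry afterwards
    imm.execute("COMMIT")

    dfr.execute("BEGIN")
    dfr.execute("INSERT INTO t VALUES ('deferred')")
    dfr.execute("COMMIT")
    seen["final rows"] = dfr.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    imm.close()
    dfr.close()
    return seen
```

As far as I can tell, the deferred transaction behaves like a reader until its first write, at which point it is refused because the immediate transaction already owns the write lock. It has to roll back and retry once the immediate transaction has committed.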

        One thing to play around with is the cache_size. This can be adjusted dynamically. The default is pretty small. Some tweaking could perhaps gain some performance. When I index my DB, I run it up to 200MB and it cuts the index time by something like 60%.
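For reference, the cache size can be changed per connection with PRAGMA cache_size. A minimal sketch in Python's sqlite3 module follows; the helper name is made up, and the 200000 KiB figure is only my approximation of the "200MB" mentioned above. A negative value is interpreted by SQLite as a size in KiB rather than a page count.

```python
import sqlite3


def set_cache_kib(conn, kib):
    """Set this connection's page cache to roughly `kib` KiB and read it back."""
    # PRAGMA arguments cannot be bound as parameters, so format an int.
    conn.execute(f"PRAGMA cache_size = {-int(kib)}")
    return conn.execute("PRAGMA cache_size").fetchone()[0]


conn = sqlite3.connect(":memory:")
current = set_cache_kib(conn, 200000)   # about 200 MB
```

The setting lasts only for the life of the connection, so it would have to be issued each time the database is opened, for example just before a large indexing run.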

Re^5: DBIx::Class and Parallel SQLite Transactions
by Corion (Patriarch) on Jul 14, 2011 at 13:41 UTC

    I see no mention of threads and forking in the DBD::SQLite documentation. Grepping the source also lists no mention of threads in the SQLite.xs file, but only within sqlite3.c itself. This indicates to me that little thought has been given to how to make DBD::SQLite and threads/fork play nice together.

    You can maybe check whether the problem is inherent to fork() and/or threads by launching multiple writers as real, separate processes via exec or system instead of fork/threads. If you still experience the segfaults/crashes, the problem likely is with your version of DBD::SQLite or some other XS code loaded in the separate process. If the problem goes away, then the problem likely is related to fork/threads, and I see no other way than to change to a different DBD if you want to keep going with fork/threads.
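The shape of such a test, sketched in Python for brevity (in Perl the launcher would use system or exec as described above). Each writer is a completely separate interpreter started from scratch, so nothing is inherited through fork or shared between threads. The log table, the worker script and the 30 second busy timeout are assumptions for the illustration.

```python
import os
import sqlite3
import subprocess
import sys
import tempfile

# Script run by every writer process; it opens its own connection.
WORKER = r"""
import sqlite3, sys
path, tag, rows = sys.argv[1], sys.argv[2], int(sys.argv[3])
conn = sqlite3.connect(path, timeout=30, isolation_level=None)
for n in range(rows):
    conn.execute("BEGIN IMMEDIATE")
    conn.execute("INSERT INTO log (writer, n) VALUES (?, ?)", (tag, n))
    conn.execute("COMMIT")
conn.close()
"""


def run_writers(path, writers=3, rows=10):
    """Start independent writer processes; return (exit codes, row count)."""
    setup = sqlite3.connect(path)
    setup.execute("CREATE TABLE IF NOT EXISTS log (writer TEXT, n INTEGER)")
    setup.commit()
    setup.close()

    procs = [
        subprocess.Popen(
            [sys.executable, "-c", WORKER, path, "w%d" % i, str(rows)])
        for i in range(writers)
    ]
    exit_codes = [p.wait() for p in procs]

    check = sqlite3.connect(path)
    total = check.execute("SELECT COUNT(*) FROM log").fetchone()[0]
    check.close()
    return exit_codes, total
```

For example, run_writers(os.path.join(tempfile.mkdtemp(), "log.db")) starts three writers of ten rows each. A non-zero exit code or a short row count would point at the database layer itself rather than at fork or threads.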

    You haven't told us your OS so far, but if you are on Windows, threads and fork() are basically the same thing there anyway, and all problems with one system are present in the other as well.

      I'll take a shot at testing this evening.

      Sorry for forgetting the obvious. My primary environment is currently Fedora 13:

      
      uname -a
      2.6.18-194.26.1.el5.028stab079.2 #1 SMP Fri Dec 17 19:25:15 MSK 2010 x86_64 x86_64 x86_64 GNU/Linux.

      My perl version info is:

      
      Summary of my perl5 (revision 5 version 10 subversion 1) configuration:
      
        Platform:
          osname=linux, osvers=2.6.32-71.14.1.el6.x86_64, archname=x86_64-linux-thread-multi
          uname='linux x86-06.phx2.fedoraproject.org 2.6.32-71.14.1.el6.x86_64 #1 smp wed jan 5 17:01:01 est 2011 x86_64 x86_64 x86_64 gnulinux '
          config_args='-des -Doptimize=-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic -Dccdlflags=-Wl,--enable-new-dtags -DDEBUGGING=-g -Dversion=5.10.1 -Dmyhostname=localhost -Dperladmin=root@localhost -Dcc=gcc -Dcf_by=Red Hat, Inc. -Dprefix=/usr -Dvendorprefix=/usr -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl5 -Dsitearch=/usr/local/lib64/perl5 -Dprivlib=/usr/share/perl5 -Dvendorlib=/usr/share/perl5 -Darchlib=/usr/lib64/perl5 -Dvendorarch=/usr/lib64/perl5 -Dinc_version_list=5.10.0 -Darchname=x86_64-linux-thread-multi -Dlibpth=/usr/local/lib64 /lib64 /usr/lib64 -Duseshrplib -Dusethreads -Duseithreads -Duselargefiles -Dd_dosuid -Dd_semctl_semun -Di_db -Ui_ndbm -Di_gdbm -Di_shadow -Di_syslog -Dman3ext=3pm -Duseperlio -Dinstallusrbinperl=n -Ubincompat5005 -Uversiononly -Dpager=/usr/bin/less -isr -Dd_gethostent_r_proto -Ud_endhostent_r_proto -Ud_sethostent_r_proto -Ud_endprotoent_r_proto -Ud_setprotoent_r_proto -Ud_endservent_r_proto -Ud_setservent_r_proto -Dscriptdir=/usr/bin -Dotherlibdirs=/usr/local/lib64/perl5/site_perl/5.10.0/x86_64-linux-thread-multi:/usr/local/lib/perl5/site_perl/5.10.0:/usr/lib64/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi:/usr/lib/perl5/vendor_perl:/usr/lib/perl5/site_perl'
          hint=recommended, useposix=true, d_sigaction=define
          useithreads=define, usemultiplicity=define
          useperlio=define, d_sfio=undef, uselargefiles=define, usesocks=undef
          use64bitint=define, use64bitall=define, uselongdouble=undef
          usemymalloc=n, bincompat5005=undef
        Compiler:
          cc='gcc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
          optimize='-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic',
          cppflags='-D_REENTRANT -D_GNU_SOURCE -fno-strict-aliasing -pipe -fstack-protector -I/usr/local/include'
          ccversion='', gccversion='4.4.5 20101112 (Red Hat 4.4.5-2)', gccosandvers=''
          intsize=4, longsize=8, ptrsize=8, doublesize=8, byteorder=12345678
          d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
          ivtype='long', ivsize=8, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
          alignbytes=8, prototype=define
        Linker and Libraries:
          ld='gcc', ldflags =' -fstack-protector'
          libpth=/usr/local/lib64 /lib64 /usr/lib64
          libs=-lresolv -lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lpthread -lc
          perllibs=-lresolv -lnsl -ldl -lm -lcrypt -lutil -lpthread -lc
          libc=, so=so, useshrplib=true, libperl=libperl.so
          gnulibc_version='2.12.2'
        Dynamic Linking:
          dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-Wl,--enable-new-dtags -Wl,-rpath,/usr/lib64/perl5/CORE'
          cccdlflags='-fPIC', lddlflags='-shared -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector --param=ssp-buffer-size=4 -m64 -mtune=generic'
      
      
      Characteristics of this binary (from libperl):
        Compile-time options: MULTIPLICITY PERL_DONT_CREATE_GVSV
                              PERL_IMPLICIT_CONTEXT PERL_MALLOC_WRAP USE_64_BIT_ALL
                              USE_64_BIT_INT USE_ITHREADS USE_LARGE_FILES
                              USE_PERLIO USE_REENTRANT_API
        Built under linux
        Compiled at Apr  5 2011 08:00:50
        %ENV:
          PERL_MM_USE_DEFAULT="1"
        @INC:
          /usr/local/lib64/perl5
          /usr/local/share/perl5
          /usr/lib64/perl5
          /usr/share/perl5
          /usr/lib64/perl5
          /usr/share/perl5
          /usr/local/lib64/perl5/site_perl/5.10.0/x86_64-linux-thread-multi
          /usr/local/lib/perl5/site_perl/5.10.0
          /usr/lib64/perl5/vendor_perl/5.10.0/x86_64-linux-thread-multi
          /usr/lib/perl5/vendor_perl/5.10.0
          /usr/lib/perl5/vendor_perl
          /usr/lib/perl5/site_perl