in reply to Re^5: DBIx::Class and Parallel SQLite Transactions
in thread DBIx::Class and Parallel SQLite Transactions

I'm well into Kreibich's book already. While it is a little dated and focused on C, there IS a lot of good stuff in it when it comes to how SQLite works.

Insofar as the locking is concerned, if I read Kreibich correctly, the exclusive lock is not requested until the commit is initiated. So other processes can write to the SQLite cache at the same time, but only one can commit at a time. Am I reading this wrong? Sounds like my reading topic for the bus ride home this afternoon :)

As for whether multiple threads would help: my application does a lot of inserts the first time through and afterwards updates most, if not all, of the same records. Since a fair number of reads are needed to decide whether to update or insert, I think there would be some advantage to a degree of parallelization, maybe two threads versus just one.

This spurs the thought of a two-thread write procedure. The first thread would identify which records are updates and which are inserts, and queue them for the second thread so that it is writing continuously. I'll have to give this a test also.
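
A minimal sketch of that two-thread idea, written with Python's built-in sqlite3 module only because it runs without extra installs (the same SQL would go through DBI or DBIx::Class). The `records` table and the names `classify`, `write` and `run_pipeline` are made up for illustration. It assumes each key appears at most once per batch, since the classifier only sees committed rows.

```python
import os
import queue
import sqlite3
import tempfile
import threading

SENTINEL = None  # marks the end of the work queue


def classify(db_path, items, work):
    """Reader thread: decide per item whether it is an insert or an update."""
    conn = sqlite3.connect(db_path)  # each thread uses its own connection
    for key, value in items:
        exists = conn.execute(
            "SELECT 1 FROM records WHERE key = ?", (key,)
        ).fetchone()
        work.put(("update" if exists else "insert", key, value))
    conn.close()  # drop any read lock before the writer is told to commit
    work.put(SENTINEL)


def write(db_path, work, counts):
    """Writer thread: the only connection that modifies the database."""
    conn = sqlite3.connect(db_path, isolation_level=None)  # manual transactions
    conn.execute("BEGIN IMMEDIATE")
    while True:
        job = work.get()
        if job is SENTINEL:
            break
        action, key, value = job
        if action == "insert":
            conn.execute("INSERT INTO records VALUES (?, ?)", (key, value))
        else:
            conn.execute("UPDATE records SET value = ? WHERE key = ?", (value, key))
        counts[action] += 1
    conn.execute("COMMIT")
    conn.close()


def run_pipeline(db_path, items):
    """Seed a small table, then run the classifier and writer side by side."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE records (key TEXT PRIMARY KEY, value INTEGER)")
    conn.execute("INSERT INTO records VALUES ('a', 1)")
    conn.commit()
    conn.close()

    work = queue.Queue()
    counts = {"insert": 0, "update": 0}
    threads = [
        threading.Thread(target=classify, args=(db_path, items, work)),
        threading.Thread(target=write, args=(db_path, work, counts)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    print(run_pipeline(path, [("a", 10), ("b", 2), ("c", 3)]))
```

The writer holds one long transaction and commits once at the end, so the classifier can keep reading while the changes sit in the writer's cache. Whether this beats a single thread would need measuring.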

Thanks!

lbe

Re^7: DBIx::Class and Parallel SQLite Transactions
by Marshall (Canon) on Jul 14, 2011 at 21:57 UTC
    I am also in a "learn mode" about the locking details. One important thing: When using the BEGIN DEFERRED transaction (the default), deadlocks are possible. A deadlock is not possible when using BEGIN IMMEDIATE transaction. On Page 154, Chapter 7, paragraph 3:
    "A BEGIN IMMEDIATE transaction can be started while other connections are reading from the database. Once started, no new writers will be allowed, but read-only connections can continue to access the database up until the point that the immediate transaction is forced to modify the database file. This is normally when the transaction is committed."
    There is some more explanation on Page 155, "When Busy becomes Blocked". So a BEGIN IMMEDIATE transaction means: I am declaring that this transaction is going to write, and I want the DB to become read-only for everyone else. If I don't get a "busy", that's what happens (the DB is read-only for other connections until I finish my transaction). My changes are held in the memory cache until I say COMMIT (a cache write is not a "real" write to the disk). When I say COMMIT, first the database stops allowing new read transactions to start. Second, it waits for all other transactions (all of them reads) to finish. Once that happens, my writes can go to disk because I have exclusive access to the DB.
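
The behaviour described above can be checked with a small experiment. This is only a sketch, using Python's built-in sqlite3 module and the default rollback-journal mode; the table `t` and the function `immediate_demo` are made up for the test. It opens three connections to one file and records what a second writer and a plain reader see while an immediate transaction is open.

```python
import os
import sqlite3
import tempfile


def immediate_demo(db_path):
    """Return what a second writer and a reader observe around BEGIN IMMEDIATE."""

    def connect():
        # isolation_level=None: the module issues no implicit BEGIN/COMMIT.
        # timeout=0: report "busy" at once instead of retrying.
        return sqlite3.connect(db_path, isolation_level=None, timeout=0)

    writer, other, reader = connect(), connect(), connect()
    writer.execute("CREATE TABLE t (x INTEGER)")
    seen = {}

    writer.execute("BEGIN IMMEDIATE")
    writer.execute("INSERT INTO t VALUES (1)")  # held in the writer's cache

    # A second connection that announces a write is turned away.
    try:
        other.execute("BEGIN IMMEDIATE")
        seen["second_writer"] = "started"
    except sqlite3.OperationalError as exc:
        seen["second_writer"] = str(exc)

    # A read-only connection still works, and sees the old contents.
    seen["rows_before_commit"] = reader.execute(
        "SELECT COUNT(*) FROM t"
    ).fetchall()[0][0]

    writer.execute("COMMIT")
    seen["rows_after_commit"] = reader.execute(
        "SELECT COUNT(*) FROM t"
    ).fetchall()[0][0]

    for conn in (writer, other, reader):
        conn.close()
    return seen


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "immediate.db")
    print(immediate_demo(path))
```

With `timeout=0` the "busy" condition surfaces as an `OperationalError` right away; a non-zero timeout would make the second writer retry for that long first.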

    I don't understand what happens if there is a mix of IMMEDIATE and DEFERRED transactions that want to do writes.
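
One way to probe the mixed case is to try it. The sketch below (Python's built-in sqlite3, default rollback-journal mode, hypothetical table `t` and function `mixed_demo`) starts an IMMEDIATE writer, then lets a DEFERRED transaction read and attempt a write, and logs the outcome of each step. It is an experiment under those assumptions, not a statement of everything SQLite guarantees.

```python
import os
import sqlite3
import tempfile


def mixed_demo(db_path):
    """Record what happens when a DEFERRED writer meets an IMMEDIATE one."""

    def connect():
        # No implicit transactions, and no retrying when the database is busy.
        return sqlite3.connect(db_path, isolation_level=None, timeout=0)

    def attempt(conn, sql):
        try:
            conn.execute(sql).fetchall()
            return "ok"
        except sqlite3.OperationalError as exc:
            return str(exc)

    immediate, deferred = connect(), connect()
    immediate.execute("CREATE TABLE t (x INTEGER)")

    steps = [
        ("immediate begin", immediate, "BEGIN IMMEDIATE"),
        ("immediate insert", immediate, "INSERT INTO t VALUES (1)"),
        ("deferred begin", deferred, "BEGIN DEFERRED"),
        ("deferred read", deferred, "SELECT COUNT(*) FROM t"),
        # The deferred transaction now tries to upgrade to a writer.
        ("deferred insert", deferred, "INSERT INTO t VALUES (2)"),
        # The deferred transaction still holds its read lock at this point.
        ("immediate commit", immediate, "COMMIT"),
        ("deferred rollback", deferred, "ROLLBACK"),
        ("immediate commit retry", immediate, "COMMIT"),
    ]
    log = {name: attempt(conn, sql) for name, conn, sql in steps}
    log["final rows"] = immediate.execute("SELECT COUNT(*) FROM t").fetchall()[0][0]

    immediate.close()
    deferred.close()
    return log


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "mixed.db")
    for step, outcome in mixed_demo(path).items():
        print(f"{step}: {outcome}")
```

In this run the deferred transaction can begin and read, but its write is refused, and the immediate writer's first COMMIT is also refused until the deferred transaction lets go of its read lock. Someone has to back off, and here it is the deferred side.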

    One thing to play around with is the cache_size. This can be adjusted dynamically. The default is pretty small. Some tweaking could perhaps gain you some performance. When I index my DB, I run it up to 200MB and it cuts the index time by something like 60%.
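
For reference, a small sketch of adjusting the cache at run time, again in Python's built-in sqlite3 (from Perl the same PRAGMA can be sent through the database handle). The helper name `set_cache_mib` is made up. It relies on reasonably recent SQLite versions reading a negative `cache_size` as a size in KiB instead of a page count, and the setting applies only to the connection that issues it.

```python
import sqlite3


def set_cache_mib(conn, mib):
    """Set this connection's page cache to roughly `mib` MiB and read it back."""
    # Negative value: SQLite interprets it as KiB rather than a number of pages.
    conn.execute(f"PRAGMA cache_size = {-int(mib) * 1024}")
    return conn.execute("PRAGMA cache_size").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(set_cache_mib(conn, 200))  # the value SQLite reports back
    conn.close()
```

The gain depends heavily on the workload, so it is worth timing a bulk insert or index build at a few different sizes.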