in reply to [SOLVED]: DBD::SQLite "file is encrypted or is not a database" in running application

I've written a Parallel::ForkManager reproduction script in hopes others can review/test and provide feedback. It doesn't involve any convoluted event code.

I created a simple database by running sqlite3 test.db < test.sql, where test.sql consists of:

DROP TABLE IF EXISTS test;

CREATE TABLE test (
    id    VARCHAR(50),
    value VARCHAR(20)
);

INSERT INTO test VALUES ('test', 0);

Here is the code. Note that lowering the number of forks and/or adding some sleeps will evade the issue, but only for a while; it eventually creeps back in. The code below breaks almost immediately (on my Raspberry Pi 3):

use warnings;
use strict;

use DBI;
use Parallel::ForkManager;

my $dbh = DBI->connect(
    "dbi:SQLite:dbname=test.db", "", "",
    { RaiseError => 1 }
) or die $DBI::errstr;

my $pm = Parallel::ForkManager->new(20);

for (1..20){
    $pm->start and next;
    my $interval = $_ / 10;
    while (1){
        dbwrite($dbh, $interval);
        #print "$interval\n";
    }
}

while (1){
    my $sth = $dbh->prepare("SELECT * FROM test WHERE id=?;");
    $sth->execute('test');
    my $val = $sth->fetchrow_hashref()->{value};
    #sleep 1;
}

sub dbwrite {
    my ($dbh, $interval) = @_;
    my $sth = $dbh->prepare('UPDATE test SET value=? WHERE id="test"');
    $sth->execute($interval);
}

On my more powerful laptop with an SSD, it takes ~10 seconds to break. Now, my real application doesn't hit the database nearly as fast as this script does. That said, it still isn't good, and I need to find a reliable solution.

Are there any reasonable fixes for this, or should I instead be looking at using a MySQL database on a remote server? DB reads/writes in the real app I'm writing happen at most once per second, give or take.

I'd really, really like to avoid using an external DB if possible; I want this app to be self-contained. I can't use a memory db (i.e. I don't know HOW to share a memory db between different processes, so any pointers here that I can test will be quite welcome as well).

Thanks for all the feedback Monks.

-stevieb


Re^2: DBD::SQLite "file is encrypted or is not a database" in running application (repro code)
by davido (Cardinal) on Oct 10, 2016 at 15:09 UTC

    See the POD for DBD::SQLite which states:

    You shouldn't (re)use a database handle you created...before you fork(). Otherwise, you might see a database corruption in the worst case. If you need to fork(), (re)open a database after you fork().

    Your sample code is doing exactly what the POD says not to do: sharing a database handle across forked children.

    SQLite's locking relies on the operating system's flock. Sharing a database handle presumably shares the underlying filehandles, and at least on my Linux systems, calling flock for an exclusive lock twice on the same filehandle is perfectly OK.

    The GNU/Linux flock manpage indicates that calling flock more than once on a file descriptor is OK, and that an existing lock is simply converted to the new lock mode. In the case of your sample script this could be even worse, because your forked children are still alive while you enter the second while(1) loop. Inside that loop you are doing SELECTs, which may be converting the database handle's LOCK_EX to a LOCK_SH. It probably doesn't matter, though: each child reusing the same handle can quietly obtain a lock, so the previous lock state is irrelevant; they all stomp on each other without warning.
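
    If you want to see that lock-conversion behaviour for yourself, here's a minimal demonstration (flock.test is just a scratch filename I made up):

    use warnings;
    use strict;
    use Fcntl qw(:flock);

    # hypothetical scratch file; any writable path will do
    open my $fh, '>', 'flock.test' or die "open: $!";

    # the first exclusive lock succeeds, as expected
    flock $fh, LOCK_EX or die "first flock: $!";

    # a second flock on the SAME filehandle also succeeds: the
    # kernel converts the existing lock rather than blocking or
    # failing, so nothing tells us we already held a lock
    flock $fh, LOCK_SH or die "second flock: $!";

    print "both flock calls succeeded; the lock was silently converted\n";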

    Instantiate a new database handle in each child, and the problem should go away.
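
    As a rough sketch of that fix, reusing the names from your own script (the actual work each child does is elided):

    use warnings;
    use strict;
    use DBI;
    use Parallel::ForkManager;

    my $pm = Parallel::ForkManager->new(20);

    for (1..20){
        $pm->start and next;

        # open a fresh connection AFTER the fork, so nothing is
        # inherited from the parent's handle
        my $dbh = DBI->connect(
            "dbi:SQLite:dbname=test.db", "", "",
            { RaiseError => 1 }
        ) or die $DBI::errstr;

        # ... the child's reads/writes go here, using its own $dbh ...

        $pm->finish;
    }

    $pm->wait_all_children;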


    Dave

      Thank you so much!

      After a few quick tests, that does indeed appear to have been the problem. I updated my test script to the below (each fork() creates its own db handle), and after letting it run for ~5 minutes, there are no issues. I'll apply similar logic to my real app and see how it goes, then mark this thread solved if the problem vanishes.

      I was looking in all the wrong places for information. I should have done more than just skim the docs. sigh.

      use warnings;
      use strict;

      use DBI;
      use Parallel::ForkManager;

      my $db_file = 'test.db';

      my $dbh = DBI->connect(
          "dbi:SQLite:dbname=$db_file", "", "",
          { RaiseError => 1 }
      ) or die $DBI::errstr;

      my $pm = Parallel::ForkManager->new(20);

      for (1..2){
          $pm->start and next;
          my $interval = $_ / 5;

          # each fork() creates and uses its own db handle
          my $dbh = DBI->connect(
              "dbi:SQLite:dbname=$db_file", "", "",
              { RaiseError => 1 }
          ) or die $DBI::errstr;

          while (1){
              dbwrite($dbh, $interval);
              #sleep 1;
          }
      }

      while (1){
          my $sth = $dbh->prepare("SELECT * FROM test WHERE id=?;");
          $sth->execute('test');
          my $val = $sth->fetchrow_hashref()->{value};
          print "$val\n";
      }

      sub dbwrite {
          my ($dbh, $interval) = @_;
          my $sth = $dbh->prepare('UPDATE test SET value=? WHERE id="test";');
          $sth->execute($interval);
      }

      update: I did trigger the error below after some time, but that's a different problem, and it appears there are built-in workarounds for it that I'll play with (one candidate is sketched below the update); mind you, it's probably triggered because we're writing to and reading from the db at full speed:

      DBD::SQLite::st execute failed: database is locked at pm.pl line 49.

      /update
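
      For the record, the first workaround I plan to try is DBD::SQLite's busy timeout, which tells a connection to wait and retry for a while instead of failing immediately when the file is locked (the 5000 ms value is just an arbitrary starting point):

      # wait up to 5 seconds for a competing lock to clear before
      # execute() fails with "database is locked"
      $dbh->sqlite_busy_timeout(5000);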

      Thanks davido, I missed that in the docs. I'll open new db handles within the fork(s), and see how that plays out.

      Hey Dave,

      I just want to say that having each process use its own DB handle seems to have corrected my issues.

      The way this particular app works is that there's a core API class (OO) at its centre. There's also a DB class (OO) that manages all of the database work (it creates the actual DB handle in its new() call). The DB handle gets inserted into the DB object, and since the API is the only thing that talks to the DB object, the DB object is inserted into the API object.

      So, what I've done is this (a rough sketch follows the list):

      • Dancer2 app library creates a new API object on instantiation, and this is what the Dancer2 routes use to make calls to and get work done
      • when the API object is first instantiated, it spins up two events that run in separate processes. When these event objects are created, they are passed in a *copy* of the API object (no sharing between procs)
      • the event objects then instantiate their own DB object (which contains a DB handle internally) and insert that DB object into the API object, overwriting the one that was created during the initial Dancer2 instantiation
      • now the Dancer2 process has its own copy of the API with its own DB object (and therefore its own DB handle), and each event has the same configuration
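
      In very rough pseudo-Perl (Event, DB, and the db() accessor are placeholder names for illustration, not my real code), each event process does something like this:

      package Event;

      sub new {
          my ($class, %args) = @_;

          # $args{api} is this process's *copy* of the API object
          my $self = bless { api => $args{api} }, $class;

          # build a brand new DB object (and therefore a brand new
          # DB handle) inside this process, then swap it into our
          # private copy of the API, replacing the pre-fork one
          $self->{api}->db( DB->new(db_file => $args{db_file}) );

          return $self;
      }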

      I know it's not very elegant at the moment, but I have quite a good test suite, so after some more testing, I'll be able to clean it up significantly now that I know what the issue was.

      I appreciate the feedback in helping me get this resolved.

      ps. The app has been running for 19+ hours as of now with a very consistent memory footprint and CPU usage; all of my timers work properly in a real run, and it seems very stable.