in reply to Re^6: threads::shared seems to kill performance (Workaround).
in thread threads::shared seems to kill performance

If I understand it correctly ... then use the shared_clone( $dbh ) ... Is that even remotely correct?

No. Look again and you'll see that I don't use the share_clone( $dbh ) at all within the threads. I never actually access the second parameter I pass:

my @threads = map threads->create( \&thread, $Q, shared_clone( $dbh ) +), 1 .. $T; #............................................^1, ^^^^^^^^^ 2 ^^^^^^^^

within the thread procedures;

sub thread { my $tid = threads->tid; my( $Q ) = @_;

I'm afraid that this is a 'left-over' -- ie. I should have deleted it before posting; sorry -- from earlier experiments that lead to my discovering the need for the workaround.

The reason we need a workaround is because sqlite :memory: dbs are only visible to the connection that creates them.

But if you share a $dbi, the contents of the dbi (the C/XS bits that the sqlite libraries use) gets mangled internally by DBI such that those libraries see the shared connection from different threads as being different connections; thus the threads see different :memory: dbs.

The workaround is to use the uri-form of the connect string + the modifiers mode=memory&cache=shared to cause sqlite to connect all the (new; unshared) connects made in each of the threads to the same db.

Why do the shared_clone( $dbh )? Isn't the shared memdb enough for the threads to have access to the same memory?

Not applicable :)

What's the purpose of the $Q->nq( ('Wakey wakey') x $T ); ... $Q->dq; ## Wait for DB ?

Why pass the string to the thread and then just dequeue it? And what's the relation to waiting for DB?

Again, this is a left-over from earlier experiments. Its purpose was to block the threads whilst I created the db in the main code as originally I was creating the threads before I created the db. The idea was to prevent the memory used by the DB being cloned when the threads are created.

However, as I'm no longer using a shared connection handle, that precaution is no longer relevant. So, the nqing of the strings and the associated dqing of them can be safely removed.

Here's the latest for of the code with a few tweaks and some timing code added:

Finally, I've still to resolve the undeleted file problem. The fact that a file is being created at all is somewhat disconcerting as it mean that I'm not getting the (purely) memory db I was hoping for; and could mean that performance isn't what it should be as a result. I did try contacting the mailing list about this, but got no useful response. I haven't had time/motivation to pursue it further.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^8: threads::shared seems to kill performance (Workaround).
by Jacobs (Novice) on Jul 24, 2013 at 20:29 UTC

    Ah, perfect, now I get it! Thank you very much :-)

    That file thing is a bit strange indeed, I hope there's some solution to that.

      That file thing is a bit strange indeed, I hope there's some solution to that.

      The problem is finding someone with some expertise of the Windows port of sqlite.

      BTW: As your data is already in an sqllite DB, rather than copying it all from the into this mode=memory DB(*), why not just open connections to the existing db file from each of the threads using the uri-form connect string + the cache=shared modifier and see how the performance pans out.

      (*especially as it seems these involve IO anyway.)


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

        Actually now that I'm looking into this more deeply, turns out I'm already doing exactly that. At least on OS X. There I can't get that :memory: to work with ?cache=shared, so it always uses a file in the end. I get this with your latest code:

        $ ls -l file* ls: file*: No such file or directory $ ./test.pl > /dev/null $ ls -l file* -rw-r--r-- 1 19456 Jul 24 22:46 file:memdb2?mode=memory&cache=shared $ ./test3.pl > /dev/null $ ls -l file* -rw-r--r-- 1 36864 Jul 24 22:46 file:memdb2?mode=memory&cache=shared

        When I use 'dbi:SQLite:dbname=:memory:' alone, it creates the database in-memory (no file), but I can't share it so that's no use.

        I've also tried all of these different URIs, but no joy (below are the corresponding files that get created):

        CONNECT=> 'dbi:SQLite:dbname=file:memory:?mode=memory&cache=shared' 19456 Jul 24 23:15 file:memory:?mode=memory&cache=shared CONNECT=> 'dbi:SQLite:dbname=:memory:?cache=shared' 19456 Jul 24 23:18 :memory:?cache=shared CONNECT=> 'dbi:SQLite:dbname=:memory:\?cache=shared' 19456 Jul 24 23:19 :memory:\?cache=shared CONNECT=> 'dbi:SQLite:dbname=:memory:cache=shared' 19456 Jul 24 23:20 :memory:cache=shared CONNECT=> 'dbi:SQLite:dbname=:memory?cache=shared:' 19456 Jul 24 23:21 :memory?cache=shared:

        So far when I use the file it's fast enough for what I need. But just out of curiosity, any chance that ?cache=share can be set outside of the URI in some way?