in reply to Re: threads::shared referencing confusion
in thread threads::shared referencing confusion

yes, that is the SQLite error point but the actual error seems to me to have happened much earlier. While trying to get the original DB handle into a shared hash things clearly go wrong. The handle is changed to a reference and SQLite is clearly erroring on the reference as it is not the open DB handle. This is what I don't understand, how do I get the original handle into a shared Hash like I do in a non-shared Hash?

SO show, at least thus far, the error isn't with SQLite a bit of code changed to demonstrate;

use DBI; use threads; use threads::shared; my $driver = 'SQLite'; my $database = ''; my $dsn = ''; my $userid = ''; my $password = ''; my $dbh; my $dsn; my $stmt; my $sth; my $rv; my @row; my %hash; my %hsh :shared; my $int :shared; my $String; my $Sdatabase; $Sdatabase = $ARGV[0]; print 'Database fully qualified path of existing SQLite DB -> ' . $Sda +tabase . "\n"; $dsn = "DBI:$driver:dbname=$Sdatabase"; print 'Database data source name for connect -> ' . $dsn . "\n"; $dbh = DBI->connect($dsn, $userid, $password, { PrintError => 1, Ra +iseError => 1, AutoCommit => 0, ReadOnly => 1 }); print 'Simple database handle in $dbh from connect -> ' . $dbh . " and + DB is now open\n"; $hash{"$Sdatabase"} = $dbh; print 'Simple database handle in non-shared Hash, $Sdatabase -> ' . $S +database . ' and $hash{"$Sdatabase"} -> ' . $hash{"$Sdatabase"} . "\ +n"; $stmt = qq(select count ( * ) from asnipv4 ); $sth = $hash{"$Sdatabase"}->prepare( $stmt ); $rv = $sth->execute(); while( @row = $sth->fetchrow_array()) { $Iknt++; $String = join ',', @row; print "$Iknt -> $String \n"; } $hash{$Sdatabase}->disconnect(); delete $hash{$Sdatabase}; # $int = $dbh; #print 'Try to get simple database handle in shared scalar $int as -> + ' . $int . "\n"; $int = shared_clone(\$dbh); print 'Try to get simple database handle in shared scalar $int via sha +red_clone(\$dbh) -> ' . $int . " <- the original handle is gone and n +ow we have a reference\n"; $hash{"$Sdatabase"} = $int; print 'Try to get shared scalar $int database handle into a non-shared + Hash $Sdatabase -> ' . $Sdatabase . ' and $hash{"$Sdatabase"} -> ' +. $hash{"$Sdatabase"} . " <- works but we have a reference and not th +e handle\n"; $hsh{"$Sdatabase"} = $int; print 'Try to get shared scalar $int database handle into a shared Has +h $Sdatabase -> ' . $Sdatabase . ' has $hsh{"$Sdatabase"} -> ' . $hsh +{"$Sdatabase"} . " <- BUT the simple = results in some other referenc +e\n"; $stmt = qq(select count ( * ) from asnipv4 ); $sth = $hsh{"$Sdatabase"}->prepare( $stmt ); $rv = $sth->execute(); while( @row = $sth->fetchrow_array()) { $Iknt++; $String = join ',', @row; print "$Iknt -> $String \n"; } $hsh{$Sdatabase}->disconnect(); delete $hsh{$Sdatabase}; exit;

yields;

[root@itx2 ~]# perl x7.pl /opt/asd/sql/dbi-indexed-GeoLite2.db-sqlite- +old Database fully qualified path of existing SQLite DB -> /opt/asd/sql/db +i-indexed-GeoLite2.db-sqlite-old Database data source name for connect -> DBI:SQLite:dbname=/opt/asd/sq +l/dbi-indexed-GeoLite2.db-sqlite-old Simple database handle in $dbh from connect -> DBI::db=HASH(0x561e4b5e +5480) and DB is now open Simple database handle in non-shared Hash, $Sdatabase -> /opt/asd/sql/ +dbi-indexed-GeoLite2.db-sqlite-old and $hash{"$Sdatabase"} -> DBI::d +b=HASH(0x561e4b5e5480) 1 -> 511665 Try to get simple database handle in shared scalar $int via shared_clo +ne(\$dbh) -> REF(0x561e4b2b9570) <- the original handle is gone and n +ow we have a reference Try to get shared scalar $int database handle into a non-shared Hash $ +Sdatabase -> /opt/asd/sql/dbi-indexed-GeoLite2.db-sqlite-old and $ha +sh{"$Sdatabase"} -> REF(0x561e4b2b9570) <- works but we have a refere +nce and not the handle Try to get shared scalar $int database handle into a shared Hash $Sdat +abase -> /opt/asd/sql/dbi-indexed-GeoLite2.db-sqlite-old has $hsh{"$S +database"} -> REF(0x561e4b5e5630) <- BUT the simple = results in some + other reference Can't call method "prepare" on unblessed reference at x7.pl line 66. [root@itx2 ~]#

It seems to me the problem is clearly that the shared Hash isn't getting the DB handle and I cannot figure out from the documentation what syntax is needed to get something 'outside the shared world', i.e. the SQLite DB handle, 'into the shared world'.

Once I get past that point then the prepare at line 66 will either work or still fail.

On a separate note you point on the DB handle is well taken. There won't actually be any concurrent DB handle access between threads as I implement a shared spin-lock mechanism to signal whether or not one of the threads is operating on the DB handle and only the thread that acquires the spin-lock operates on that DB handle (of course releasing the spin-lock when it is done). So far in 'many' threads not using sharing (and thusly each actually having their own unique local in scope to the subroutine that does the work) I still use the spin-lock mechanism to single single access per DB handle to achieve non-concurrent actions (select at this time but insert or replace code is done but not yet being tested/verified because I am chasing the sharing of a single DB handle Hash error right now). Other mechanisms work similarily to signal a DB handle should-be/was released, LRU to allow a DB handle not accessed 'recently' (some other code doing housekeeping on open handles) being eligible for closure, etc which allows me to control how many handles can be open at any given time and being driven by LRU to handle opened/needing-open. Why all this? 1) to remove redundant/repetative DB open/close (always open == ready to be worked on right now), 2) many threads buring through many open handles to achieve higher query (and later insert or replace) performance. There are actual time-series DB's holding a range of 'rows' but I have all the partitioning code working which limits the size of a time-series DB chunk/partition, i.e. SQLite DB's do not grow in size in an uncontrolled way (they are time range and size limited (partitioned) (in some simple to think about way like an Oracle RDBMS with field based partitioning), and the various code based that add-replace/delete/query all find their way throught the range of partitions they need to satisfy their function add-replace/delete/query...

So, yes your caution is well taken and having 'worried' about it early on I am working to 'code around' the lack of SQLite concurrency (while not complaining that SQLite should have some sort of concurrency).

Code and paragraph tags added by Grandfather. See Markup in the Monastery

Replies are listed 'Best First'.
Re^3: threads::shared referencing confusion
by Marshall (Canon) on Aug 22, 2024 at 07:26 UTC
    "lack of SQLite concurrency" - SQLite has plenty of concurrency for reading see the docs: SQLite Threading Modes. I would prototype your testing app with the default Serialization mode and see what performance results.

    I ran a short benchmark to see how fast the open/close function is. On my > 10-year-old machine, more than 2,000 per second. Your machine is probably much faster than mine.

    You should be aware that a prepare operation can be slow. If you are stashing DBhandles, then for performance reasons, you might need to stash the prepared SQL queries for at least common situations. Note prepare is table specific.

    I was shocked at this spin-lock idea. That is almost certainly not what you should do. Here is a description from Intel spin-locks harmful. In the first instance don't worry about this and let SQLite serialize things and see how fast it is. Then probably for multi-thread in your scheme, you need flock mechanism.

    How big is your DB? You should be aware that it is possible to dynamically manage the amount of memory that SQLite uses. I have tested this and actually used it in one of my applications. I was creating a heavily indexed DB. I inserted all the rows, then ran RAM usage up and did the indexing en masse for a huge performance gain. If all or most of the DB is memory resident, this cuts down on disk access. You could experiment with different sizes and let SQLite decide what to put into memory and when. Of course, once your app becomes more memory and CPU-bound, it won't make much difference to have more threads than you have physical cores.

    My advice is to do some benchmarking and then let that guide the design decisions.

    use strict; use warnings; use Time::HiRes qw( gettimeofday ); use DBI; my $dbfile = "testDB.sqlite"; my $start = gettimeofday; for (0..10000){ my $dbh = DBI->connect("dbi:SQLite:dbname=$dbfile","","",{RaiseErro +r => 1}) or die "Couldn't connect to database: " . DBI->errstr; $dbh->disconnect; } my $end=gettimeofday; print "",($end-$start),"\n"; #4.54577684402466 # so approximately 2,000 per second
Re^3: threads::shared referencing confusion
by ikegami (Patriarch) on Aug 22, 2024 at 04:51 UTC

    It seems to me the problem is clearly that the shared Hash isn't getting the DB handle

    Yup. Just like I said. You can't share a DB handle, so you can't place it in a shared variable. On top of explaining this, I told you what you must do.

    On a separate note you point on the DB handle is well taken. There won't actually be any concurrent DB handle access [...]

    Apparently not, since I didn't talk about concurrency at all! So I didn't make whatever point you think I made about concurrency.

Re^3: threads::shared referencing confusion
by Marshall (Canon) on Aug 21, 2024 at 21:03 UTC
    I cannot read your code - you need code tags for formatting.

    If I understand ikegami correctly, you can only use a DB handle in the thread that created it. Regardless of any concurrent operations.

    Perhaps you can back up and explain your application a bit better? Also be aware that write performance of SQLite is very slow in a multi-writer scenario because the writer must obtain an exclusive lock on the DB file. Concurrent reads from multiple threads is fine. Each thread must have its own DB handle.

    Update: I would recommend that you come up with a simple model of what you expect your application to be doing. You can fiddle with this to run various benchmarks, etc. I would let the DB handle all of the file locking. There is no need for you to serialize the requests yourself. Again, I am a bit lost because I don't understand your app.

      If I understand ikegami correctly, you can only use a DB handle in the thread that created it. Regardless of any concurrent operations.

      I didn't say that. It might be true, but it's not what I said.

      I said you couldn't share one.

      So how could multiple threads have a copy of the handle if it can't be shared? Through the cloning process that happens a new thread is created.

      my $x = 123; ( async { $x = 456; say $x; # 456 } )->join(); say $x; # 123

      There are two $x since the entire interpreter is cloned when it creates a thread.

      There's even a callback that allows libraries with external resources to clone themselves. Whether DBI and DBD::SQLite takes advantage of that or not is unknown to me. But I find it unlikely. How do you clone a database connection? So it's probably true that one can only use DB handle in the thread that created it.

        You wrote:
        my $x = 123; ( async { $x = 456; say $X; # 456 } )->join(); say $x; # 123
        but I don't think you tried it; on the line you annotate with # 456, you probably mean to say $x, but you put say $X. Tragically, Perl is case-sensitive when it comes to variables.
      yikes, somehow that all sure got messed up into an unreadable blob of characters!

        and you didn't notice before you clicked the "create" button in the preview page for the comment?

        For future reference see Markup in the Monastery. Also note the following text just above the edit field you entered your comment into:

        Use: <p> text here (a paragraph) </p> and: <code> code here </code>

        and the helpful material immediately below the edit field that emphasizes the use of paragraph tags and code tags.

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re^3: threads::shared referencing confusion
by etj (Priest) on Aug 21, 2024 at 20:00 UTC
    This needs <code> tags around the code here, which is unreadable currently.