in reply to threads::shared referencing confusion
This is the failing line: $hsh{"$Sdatabase"} = $dbh;
A database handle is a complex thing: it may reference file components and external library resources, which can't be shared. (The specific violation is that $dbh is a reference, but it doesn't reference something shared.)
Create the database handle in the thread that will use it.
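A minimal sketch of that advice (file name, table, and schema are illustrative, not from the thread): share the database *path*, which is a plain string and shares fine, and let each thread open its own handle from it.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use DBI;

# The database *path* is a plain string, so it can live in a shared hash.
my %cfg :shared;
$cfg{dbfile} = 'example.sqlite';

# Set up a tiny table (hypothetical schema, for illustration only).
{
    my $dbh = DBI->connect("dbi:SQLite:dbname=$cfg{dbfile}",
                           '', '', { RaiseError => 1 });
    $dbh->do('CREATE TABLE IF NOT EXISTS readings (t INTEGER, v REAL)');
    $dbh->disconnect;
}

sub worker {
    # The handle is created inside the thread that uses it.
    my $dbh = DBI->connect("dbi:SQLite:dbname=$cfg{dbfile}",
                           '', '', { RaiseError => 1 });
    my ($n) = $dbh->selectrow_array('SELECT count(*) FROM readings');
    $dbh->disconnect;
    return $n;
}

my @threads = map { threads->create(\&worker) } 1 .. 4;
print $_->join, "\n" for @threads;
```

Only the string crosses the thread boundary; every $dbh stays private to its thread.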
Re^2: threads::shared referencing confusion
by Anonymous Monk on Aug 21, 2024 at 17:59 UTC
Yes, that is where SQLite reports the error, but the actual error seems to me to have happened much earlier. While trying to get the original DB handle into a shared hash, things clearly go wrong: the handle is changed to a reference, and SQLite errors on that reference because it is not the open DB handle. This is what I don't understand: how do I get the original handle into a shared hash the way I do with a non-shared hash? To show that, at least thus far, the error isn't with SQLite, here is a bit of code changed to demonstrate;
yields;
It seems to me the problem is clearly that the shared hash isn't getting the DB handle, and I cannot figure out from the documentation what syntax is needed to get something 'outside the shared world', i.e. the SQLite DB handle, 'into the shared world'. Once I get past that point, the prepare at line 66 will either work or still fail.

On a separate note, your point on the DB handle is well taken. There won't actually be any concurrent access to a DB handle between threads, because I implement a shared spin-lock mechanism to signal whether one of the threads is operating on the DB handle; only the thread that acquires the spin-lock operates on that handle (and of course releases the spin-lock when it is done). So far, across 'many' threads not using sharing (each thus having its own unique handle, local in scope to the subroutine that does the work), I still use the spin-lock mechanism to signal single access per DB handle and so achieve non-concurrent actions (select at this time; the insert-or-replace code is done but not yet tested/verified, because I am chasing this shared-handle error right now). Other mechanisms work similarly to signal that a DB handle should be or was released, with an LRU making any handle not accessed 'recently' (some other code does housekeeping on open handles) eligible for closure, etc. This lets me control how many handles can be open at any given time, driven by the LRU to decide what stays open or needs opening.

Why all this? 1) To remove redundant/repetitive DB open/close (always open == ready to be worked on right now); 2) many threads burning through many open handles to achieve higher query (and later insert-or-replace) performance. There are actual time-series DBs holding a range of 'rows', but I have all the partitioning code working, which limits the size of a time-series DB chunk/partition, i.e.
SQLite DBs do not grow in size in an uncontrolled way (they are time-range and size limited, i.e. partitioned, in a simple-to-think-about way like an Oracle RDBMS with field-based partitioning), and the various code bases that add-replace/delete/query all find their way through the range of partitions they need to satisfy their function... So, yes, your caution is well taken: having 'worried' about it early on, I am working to 'code around' the lack of SQLite concurrency (while not complaining that SQLite should have some sort of concurrency).

Code and paragraph tags added by GrandFather. See Markup in the Monastery
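The hand-rolled spin-lock signalling described above can also be done without busy-waiting, using the blocking primitives threads::shared already provides. A minimal sketch, with an illustrative shared flag per handle slot (names are hypothetical):

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %busy :shared;   # $busy{$slot} is true while some thread owns that slot

sub with_slot {
    my ($slot, $code) = @_;
    {
        lock %busy;
        cond_wait %busy while $busy{$slot};   # sleep until free; no spinning
        $busy{$slot} = 1;                     # acquire the slot
    }
    $code->();                                # work on this slot's private $dbh
    {
        lock %busy;
        $busy{$slot} = 0;                     # release the slot
        cond_broadcast %busy;                 # wake any waiters
    }
}
```

cond_wait atomically releases the lock and suspends the thread, so waiters consume no CPU while blocked, unlike a spin-lock.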
by Marshall (Canon) on Aug 22, 2024 at 07:26 UTC
I ran a short benchmark to see how fast the open/close function is. On my >10-year-old machine: more than 2,000 per second. Your machine is probably much faster than mine. You should be aware that a prepare operation can be slow. If you are stashing DB handles, then for performance reasons you might also need to stash the prepared SQL queries, at least for common situations. Note that prepare is table specific.

I was shocked at this spin-lock idea. That is almost certainly not what you should do. Here is a description from Intel of why spin-locks are harmful. In the first instance, don't worry about this: let SQLite serialize things and see how fast it is. Then, for multi-thread in your scheme, you probably need a flock mechanism.

How big is your DB? You should be aware that it is possible to dynamically manage the amount of memory that SQLite uses. I have tested this and actually used it in one of my applications. I was creating a heavily indexed DB: I inserted all the rows, then ran RAM usage up and did the indexing en masse, for a huge performance gain. If all or most of the DB is memory resident, this cuts down on disk access. You could experiment with different sizes and let SQLite decide what to put into memory and when.

Of course, once your app becomes more memory- and CPU-bound, it won't make much difference to have more threads than you have physical cores. My advice is to do some benchmarking and then let that guide the design decisions.
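The memory tuning Marshall describes can be done with SQLite's cache_size pragma (a negative value is a budget in KiB); a sketch, with file, table, and sizes purely illustrative:

```perl
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:SQLite:dbname=big.sqlite', '', '',
                       { RaiseError => 1 });

$dbh->do('PRAGMA cache_size = -512000');   # allow ~500 MiB of page cache
# ... bulk inserts here ...
$dbh->do('CREATE INDEX IF NOT EXISTS idx_t ON readings (t)');  # index en masse
$dbh->do('PRAGMA cache_size = -2000');     # drop back to a small cache

$dbh->disconnect;
```

The pragma is per-connection and takes effect immediately, so it can be raised just for the expensive indexing step and lowered afterwards.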
by ikegami (Patriarch) on Aug 22, 2024 at 04:51 UTC
Yup. Just like I said. You can't share a DB handle, so you can't place it in a shared variable. On top of explaining this, I told you what you must do.
Apparently not, since I didn't talk about concurrency at all! So I didn't make whatever point you think I made about concurrency.
by Marshall (Canon) on Aug 21, 2024 at 21:03 UTC
If I understand ikegami correctly, you can only use a DB handle in the thread that created it, regardless of any concurrent operations. Perhaps you can back up and explain your application a bit better? Also be aware that the write performance of SQLite is very slow in a multi-writer scenario, because each writer must obtain an exclusive lock on the DB file. Concurrent reads from multiple threads are fine. Each thread must have its own DB handle.

Update: I would recommend that you come up with a simple model of what you expect your application to be doing. You can fiddle with this to run various benchmarks, etc. I would let the DB handle all of the file locking; there is no need for you to serialize the requests yourself. Again, I am a bit lost because I don't understand your app.
by ikegami (Patriarch) on Aug 22, 2024 at 04:49 UTC
I didn't say that. It might be true, but it's not what I said. I said you couldn't share one. So how could multiple threads have a copy of the handle if it can't be shared? Through the cloning process that happens when a new thread is created.
There are two $x because the entire interpreter is cloned when it creates a thread. There's even a callback that allows libraries with external resources to clone themselves. Whether DBI and DBD::SQLite take advantage of that is unknown to me, but I find it unlikely: how do you clone a database connection? So it's probably true that one can only use a DB handle in the thread that created it.
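The callback ikegami mentions is perl's CLONE class method: when a thread is spawned, perl calls CLONE once per package, in the new thread, right after the interpreter is cloned. A minimal sketch (the package and its behaviour are illustrative, not DBI's actual implementation):

```perl
use strict;
use warnings;
use threads;

package My::Handle;   # hypothetical class wrapping an external resource

sub new { bless { fh => undef }, shift }

# Called by perl in each *new* thread after cloning. A real class would
# re-open or invalidate each cloned object's external resource here,
# since the raw OS-level resource was not duplicated by the clone.
sub CLONE {
    my ($class) = @_;
    warn "$class cloned into thread " . threads->tid . "\n";
}

package main;
my $h = My::Handle->new;
threads->create(sub { 1 })->join;   # triggers My::Handle::CLONE
```

A class that does not implement CLONE gets a bitwise copy of its objects, which is exactly why a cloned database handle is generally unusable outside its original thread.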
by etj (Priest) on Aug 22, 2024 at 14:36 UTC
by ikegami (Patriarch) on Aug 22, 2024 at 14:38 UTC
by Anonymous Monk on Aug 21, 2024 at 21:15 UTC
by GrandFather (Saint) on Aug 21, 2024 at 21:24 UTC | |
by Anonymous Monk on Aug 21, 2024 at 23:47 UTC | |
by etj (Priest) on Aug 21, 2024 at 20:00 UTC | |