Annirak has asked for the wisdom of the Perl Monks concerning the following question:

I'm spawning a number of worker threads via Thread::Pool::Simple. I want to have each of them connect to a database in &init() or &pre() and commit their changes in &post(). I don't see how to create a thread local variable using Thread::Pool::Simple.

The best solution I've come up with so far is to create a shared hash, then place the database handle into the hash (with the thread ID as the key) in pre, and delete it in post, after committing the changes.

Is there a better way?

Replies are listed 'Best First'.
Re: Thread local variables in Thread::Pool::Simple
by ikegami (Patriarch) on Sep 17, 2009 at 21:00 UTC

    I don't see how to create a thread local variable using Thread::Pool::Simple.

    The best solution I've come up with so far is to create a shared hash

    Hint: If you want a thread-local variable, don't start by taking a thread-local variable and making it shared.

    All variables are thread-local unless you explicitly share them.

    my $dbh;

    sub init {
        $dbh = DBI->connect(...);
        ...
    }

    my $pool = Thread::Pool::Simple->new(
        ...
        init => [ \&init ],
        ...
    );
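To illustrate the point that variables are thread-local unless explicitly shared, here is a minimal sketch using only the core threads and threads::shared modules rather than Thread::Pool::Simple. The variable names are made up for the example, and it assumes a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $local = 0;             # not shared: every thread works on its own copy
my $shared :shared = 0;    # explicitly shared: one variable for all threads

my @workers = map {
    threads->create(sub {
        $local++;          # changes only this thread's private copy
        lock($shared);     # held until the end of the sub
        $shared++;         # changes the single shared variable
        return $local;
    });
} 1 .. 3;

my @seen = map { $_->join } @workers;

print "local in parent: $local\n";    # 0, the parent's copy was never touched
print "shared:          $shared\n";   # 3, one increment per worker
print "workers saw:     @seen\n";     # 1 1 1
```

Each worker starts from its own copy of $local, which is why a plain "my $dbh" assigned inside init ends up holding a separate handle in every thread.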

      What's with the double assignment?

      $dbh = $dbh = DBI->connect(...);

      A typo, or ...?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Copy and paste bug. Fixed.

      Thanks, that "handled" my problem nicely.

      Now I have a new problem, though: I'm building a file parser and I wanted to implement fairly granular multithreading by offloading individual file jobs to the worker threads. I had hoped that while one thread was waiting for a file to be read in, another would be able to parse a file that was already loaded.

      I guess that multithreading isn't the solution because the parse of my file tree used to take ~200 seconds, and with 4 worker threads, it now takes ~690 seconds. Perhaps multiprocessing would work better. I know that the process isn't disk limited because the CPU usage for the parsing process sits at ~100%.

      This leads to the next question: Is there a multi-processing equivalent to Thread::Pool::Simple, which encapsulates IPC? Or do I need to set up my own process pool manager with Unix pipes/TCP streams? Perhaps this could take advantage of the multi-processor machine that this is running on (2x Opteron 252).

      Thanks,
      Annirak

        Don't threads use any available CPUs? Maybe not.

        Anyway, since Thread::Pool::Simple uses threads, the simplest solution is just to add use forks; early in your program. (Before use Thread::Pool::Simple;, at least.)

        By the way, how often do you end up connecting to the database? Maybe the parent should do all the database stuff. ( Why does a parser even need a database? )

        I wonder how well profilers deal with threads. ( "Devel::NYTProf is not currently thread safe." doh! )

        I guess that multithreading isn't the solution because the parse of my file tree used to take ~200 seconds, and with 4 worker threads, it now takes ~690 seconds.

        Care to show us the code and see if we cannot improve that for you?


        As ikegami alludes to in a later post, SQLite has some issues with concurrency. There is no 'row-locking', or even 'table-locking'. Any thread that begins a write locks out all others until it is done. This would be the same whether you use multiple threads or multiple processes.
      On a beginner's tangent, how would you call init() from $pool in the code above? My best guess is:

      &{@{$pool->init}[0]}

      $pool->init returns the arrayref which is dereferenced and accessed with enclosing @{} and trailing [0], returning the coderef, which is in turn dereferenced with enclosing &{}?

        You can't. The object doesn't provide a means of accessing the value you pass to init. But if it provided an accessor, I'm with Anonymous Monk: use arrows.

        Say the accessor is named "init",

        $pool->init   # Returns the array ref passed to the constructor.
            ->[0]     # Access to the first element of the referenced array.
            ->();     # Call the referenced sub.
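A runnable sketch of the arrow chain above. Since Thread::Pool::Simple has no such accessor, this uses a hypothetical stand-in class (My::Pool) whose init method simply returns the array ref given to the constructor. It also shows the equivalent nested-dereference form for comparison.

```perl
use strict;
use warnings;

# Hypothetical stand-in class for illustration only.
package My::Pool;

sub new {
    my ($class, %args) = @_;
    return bless { init => $args{init} }, $class;
}

# Accessor: returns the array ref that was passed to new().
sub init { return $_[0]{init} }

package main;

sub init { return "init called with: @_" }

my $pool = My::Pool->new( init => [ \&init ] );

# Arrow form: method call, then element 0, then call the code ref.
my $arrow = $pool->init->[0]->('a', 'b');

# Nested form: same steps, read from the inside out.
my $nested = &{ ${ $pool->init }[0] }('a', 'b');

print "$arrow\n";     # init called with: a b
print "$nested\n";    # init called with: a b
```

Both forms do the same thing; the arrow version reads left to right, which is the main argument for preferring it.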
Re: Thread local variables in Thread::Pool::Simple
by Illuminatus (Curate) on Sep 17, 2009 at 21:04 UTC
    I guess what you are really asking is 'how can I create a thread-local variable that is accessible from 'pre', 'do' and 'post'?' Why are you not doing everything inside the 'do' function?
      It didn't make a lot of sense to open and close an SQLite database handle for every job that comes through. I figured that each thread should have an SQLite handle that is persistent over the life of the thread.
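The "one handle per thread, reused for every job" idea can be sketched with plain core threads and Thread::Queue instead of Thread::Pool::Simple. This is only an illustration under stated assumptions: the hash stored in $handle is a stand-in for a real DBI->connect call, the worker sub and job values are invented for the example, and a perl built with thread support is assumed.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $jobs = Thread::Queue->new;

sub worker {
    # Stand-in for DBI->connect(...): created once, when the thread starts.
    my $handle = { tid => threads->tid, jobs_done => 0 };

    # Reuse the same handle for every job; undef is the "no more work" marker.
    while (defined(my $job = $jobs->dequeue)) {
        $handle->{jobs_done}++;
    }

    # A real worker would commit and disconnect here, once per thread.
    return $handle->{jobs_done};
}

my @workers = map { threads->create(\&worker) } 1 .. 2;

$jobs->enqueue($_) for 1 .. 10;       # ten jobs
$jobs->enqueue(undef) for @workers;   # one stop marker per worker

my $total = 0;
for my $thr (@workers) {
    my ($done) = $thr->join;
    $total += $done;
}

print "jobs processed: $total\n";     # 10
```

How the ten jobs are split between the two workers varies from run to run, but each worker sets up its handle exactly once.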