in reply to Re^2: mutiple threading
in thread mutiple threading

as per the suggestions by perlCrazy, I think a possible thread safe re-write of above code can be

use threads; use DBI; $SQL_1 = "SELECT * FROM x"; $SQL_2 = "SELECT * FROM y"; $SQL_3 = "SELECT * FROM z"; sub start_thread { my ($SQL) = @_; my $dbh = DBI->connect( <your connection string> ); # print('Thread started: ', SQL, "\n"); my $sth = $dbh->prepare($SQL); $sth->execute(); < Add your code here > } my $thr1 = threads->create('start_thread', $SQL_1); my $thr2 = threads->create('start_thread', $SQL_2); my $thr3 = threads->create('start_thread', $SQL_3); $thr1->join(); $thr2->join(); $thr3->join();



There are three sides to any argument.....
your side, my side and the right side.

Replies are listed 'Best First'.
Re^4: mutiple threading
by BrowserUk (Patriarch) on Aug 09, 2007 at 16:05 UTC
    But BEWARE, some underlying database APIs (the code the DBD driver uses to talk to the database, often supplied by the database vendor) are not thread safe. If it's not thread safe, then allowing more than one thread to enter the code at the same time may cause subtle/serious problems. In some cases allowing more than one thread to enter the code, even if not at the same time, can cause problems. You have been warned.

    Sorry to harp on, but unless the versions of DBI being used by perlcrazy and the drivers that underlie it have been explicitly declared thread safe, your example may or may not work. May or may not give silently corrupt results.

    DBI from one thread is safe. Multiple threads at a time may not be.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Hi,

      I found the following in CPAN

      Making your module threadsafe

      Since 5.6.0, Perl has had support for a new type of threads called interpreter threads (ithreads). These threads can be used explicitly and implicitly.

      Ithreads work by cloning the data tree so that no data is shared between different threads. These threads can be used by using the threads module or by doing fork() on win32 (fake fork() support). When a thread is cloned all Perl data is cloned, however non-Perl data cannot be cloned automatically. Perl after 5.7.2 has support for the CLONE special subroutine. In CLONE you can do whatever you need to do, like for example handle the cloning of non-Perl data, if necessary. CLONE will be called once as a class method for every package that has it defined (or inherits it). It will be called in the context of the new thread, so all modifications are made in the new area. Currently CLONE is called with no parameters other than the invocant package name, but code should not assume that this will remain unchanged, as it is likely that in future extra parameters will be passed in to give more information about the state of cloning.

      If you want to CLONE all objects you will need to keep track of them per package. This is simply done using a hash and Scalar::Util::weaken().

      Perl after 5.8.7 has support for the CLONE_SKIP special subroutine. Like CLONE, CLONE_SKIP is called once per package; however, it is called just before cloning starts, and in the context of the parent thread. If it returns a true value, then no objects of that class will be cloned; or rather, they will be copied as unblessed, undef values. This provides a simple mechanism for making a module threadsafe; just add sub CLONE_SKIP { 1 } at the top of the class, and DESTROY() will be now only be called once per object. Of course, if the child thread needs to make use of the objects, then a more sophisticated approach is needed.

      Like CLONE, CLONE_SKIP is currently called with no parameters other than the invocant package name, although that may change. Similarly, to allow for future expansion, the return value should be a single 0 or 1 value.

      So do you think I can make my DBI thread safe ? please let me know your thoughts. I am asking about threadsafty of DBI and not with respect to above program.

      Thanks in advance.

      --VC



      There are three sides to any argument.....
      your side, my side and the right side.

        I can only direct you to the paragraph labelled "Thread Safety" in the pod for DBI. From which I'll just quote a selective highlight.

        But BEWARE, .... You have been warned.

        The passage you quoted is aimed at the authors of modules, not users. Unless you are the author of DBI, and the DB-specific drivers that run on top of it, and you are very knowledgable about the internals of the vendor-supplied DB interface drivers, you should probably not be trying to "make my DBI thread safe". Accept the limitation and consider other ways of tackling your problem.

        The first thing to consider is: why do you want to access the database from multiple threads in the first place? If you can describe the application that you envisage using this for, then it may be demonstrable that there would be no benefit to having multiple threads accessing the DB. It may be better to have a single thread that fetches the data and then mutiple threads to process it.

        Without specific scenarios, it's not at all clear whether there would be any benefit to doing this.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re^4: mutiple threading
by Anonymous Monk on Sep 03, 2007 at 10:59 UTC
    Hi, I agree with the above suggestion. if only three connections are required, above code will work fine. But, let us consider some situation when 100-1000 connection will be made, in that case only part of connection will be working fine, other DB connection will be rejected due to more connections. Please suggest some way to solve this.....

      Hi,

      Most important thing is why do you need 100s of connections from same program to the same database? You need to re think about the architecture of the program as DB connections and their management are one among the most costly part of a program. We always try to have minimum number of connections to the database. Thats why we go for Apache::DBI - persistent connection. With this we share same database connection between more than one HTTP connections! ie we don't even have a new DB connection for every request, but we are reusing same database connection.

      Its always better to use a single thread (with single db connection) or a very few(<5) threads to deal with all database queries and manipulate the data in other 100s of threads as BrowserUk suggests.

      In this same thread itself, one of the most experienced monk BrowserUk is asking why do we need 3 connections (Re^6: mutiple threading). Then I cant even imagine about 1000 connections

      to have 100 - 1000 of connections, set the maximum number of allowed connections to your database (in your database configuration) to multiple of the actual number of connections per an instance. Also consider upgrading your systems to multiply CPUs and multiply your RAM to some values in GBs. Imagine your program has 200 connections and some 5 instance run at same time, you need 1,000 connections! (?) So re-think about the arch.

      The solution I suggested is valid only with all faults pointed out by BrowserUk. My solution just "an answer to the question" without considering the problems it can bring.

      Cheers !

      --VC



      There are three sides to any argument.....
      your side, my side and the right side.