dr_k has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I'm trying to write a script that has to be able to do the following:

1) receive data from an IPC SysV message queue
2) find out whether the data are in a MySQL database
3) modify the data and send them via XML-RPC to an XML-RPC server

The thing is: a) it has to be multiprocessing, and b) the input rate is roughly 50-100 messages/sec.

The problem here is MySQL plus multiprocessing. I tried fork(), but the children were not able to use one database handle; I had to copy it for each child, so it was really slow (8 inputs per second was the upper limit). Then I moved to threads, and the first thing I realized was that a database handle can't be shared among threads in DBI. So now I'm trying the combination DBIx::Threaded + threads, but I can't make it work right. I can, however, make it behave in two ways:

i) new threads are created with detach(); thread no. 2 successfully finishes its work, but at the same time it destroys the database handle
ii) threads are created without detach(); the database handle is shared with no problems until the system runs out of memory (because the threads don't die at the end of the subroutine)

Here is the relevant part of the code for variant ii):

#!/usr/bin/perl -w
use threads;
use DBI;
use IPC::SysV qw(IPC_PRIVATE S_IRWXU S_IRWXG S_IRWXO IPC_CREAT);
use IPC::Msg;
use DBIx::Threaded;
use Thread::Queue::Duplex;

# SysV IPC message queue
$msg = new IPC::Msg(0x4142678a, S_IRWXU | S_IRWXG | S_IRWXO | IPC_CREAT);
$msgtype = 2;

# creating the database handle
$dbh = DBIx::Threaded->connect(
    "DBI:mysql:database=$DataBaseName;host=$DataBaseHost",
    $DataBaseUser, $DataBasePass,
    { RaiseError => 1, AutoCommit => 0 },
) || die "Unable to connect to $DataBaseHost because $DBI::errstr";

# useless statement handle; there must be at least one in existence,
# otherwise $dbh is destroyed after the first cycle
$sth_zb = $dbh->prepare("SELECT * FROM testcallapp");

# taking data from the message queue
while ($msg->rcv($buf, 256)) {
    # printing the currently living threads
    foreach $thr (threads->list) {
        my $tred = $thr->tid;
        print "DEBUG: $tred\n";
    }
    # creating a new thread
    threads->new(\&vlakno, $buf);
    print "DEBUG: $buf !\n";
}

# subroutine for a thread
sub vlakno {
    my ($buffer) = @_;
    # splitting the buffer into data fields
    my @sysv  = split(/,/, $buffer);
    my $natv2 = $sysv[0];
    my $v2    = $sysv[1];
    # statement handle: finding out whether the data are in the database
    my $sth = $dbh->prepare("SELECT host FROM ... ");
    $sth->execute();
    $sth->bind_columns(\$host);
    # printing the data from the database
    while ($sth->fetch()) {
        print "DEBUGGER: $host\n";
    }
    # finishing the statement handle
    $sth->finish;
}
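One pattern that sidesteps the shared-handle problem entirely is to let a single dedicated thread own the database handle and feed it work through a queue, so no handle is ever cloned or shared. Below is a minimal sketch of that idea using the core Thread::Queue module; the DBI calls are shown as comments because the actual query belongs to your schema, and the sample values (alpha, beta, gamma) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# One dedicated thread owns the database handle; workers only enqueue
# requests, so the handle never crosses a thread boundary.
my $requests = Thread::Queue->new();

my $db_thread = threads->create(sub {
    # my $dbh = DBI->connect(...);   # handle created *inside* this thread
    while (defined(my $host = $requests->dequeue())) {
        # my $sth = $dbh->prepare("SELECT host FROM ... WHERE host = ?");
        # $sth->execute($host);
        print "DB thread looking up: $host\n";
    }
});

# any number of producer threads can enqueue work safely
$requests->enqueue($_) for qw(alpha beta gamma);
$requests->enqueue(undef);   # termination marker: tells the DB thread to exit
$db_thread->join();
```

Because Thread::Queue serializes access, the worker threads can detach or die freely without ever touching $dbh.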

So... do you have any ideas how to make it work this way, or how to meet the script's requirements some other way?

dr_k

Replies are listed 'Best First'.
Re: MySQL multiprocessing
by perrin (Chancellor) on Dec 11, 2007 at 14:47 UTC
    It doesn't make any sense that your forking approach was slow. Do you know what the slow part was? MySQL is very fast at creating new connections, and you can even pre-fork your connections so they are ready to roll when a query comes in.
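    The pre-forking approach could look roughly like the sketch below: fork the workers up front, and have each child open its own connection after the fork (a handle inherited across fork() must never be used by both parent and child). The DBI connect is commented out and the $DataBase* variables are the placeholders from the original post; the "worker ready" prints just stand in for the real queue-reading loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pre-fork a small pool of workers; each one connects to MySQL itself,
# *after* the fork, so no handle is shared with the parent.
my $WORKERS = 4;
my @pids;

for my $n (1 .. $WORKERS) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: open a private connection here, then loop on the queue
        # my $dbh = DBI->connect(
        #     "DBI:mysql:database=$DataBaseName;host=$DataBaseHost",
        #     $DataBaseUser, $DataBasePass, { RaiseError => 1 },
        # );
        print "worker $n ready with its own connection\n";
        exit 0;
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;
```

    Since MySQL connections are cheap, the pool is "ready to roll" once the children have started, and the per-query cost is just the query itself.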

    If I were doing this and the obvious forking approach was too slow, I would make a mod_perl server that waits for queries and returns the results. You could just send the queries to it over HTTP from your script that reads the queue. This is all basically done for you in the GoferTransport-HTTP module.
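    With DBD::Gofer the client side is just a wrapped DSN; the sketch below shows the general shape of such a connect string, assuming the http transport from the GoferTransport-HTTP distribution is installed and a gofer server is listening at the (entirely hypothetical) URL shown. This is a configuration fragment, not tested code; check the DBD::Gofer documentation for the exact attributes your setup needs.

```perl
use DBI;

# Hypothetical example: route queries through a Gofer HTTP proxy.
# The url and the $DataBase* placeholders must be replaced with real values.
my $dbh = DBI->connect(
    "dbi:Gofer:transport=http;url=http://db-proxy.example.com/gofer;"
  . "dsn=dbi:mysql:database=$DataBaseName;host=$DataBaseHost",
    $DataBaseUser, $DataBasePass,
);
```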

      You are right, the real slowdown is the XML-RPC client I use (Frontier::Client)... time to try another one, I suppose ;) And you are right again that, even leaving that aside, forking is way faster than threads. So my next step will be fork() plus a different XML-RPC client. Thanks ;]