dr_k has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I'm trying to write a script that has to be able to do the following:

1) receive data from an IPC SysV message queue
2) find out whether the data are in a MySQL database
3) modify the data and send them via XML-RPC to an XML-RPC server

The thing is: a) it has to be multiprocessing, and b) the input rate is roughly 50-100 messages/sec.

The problem here is MySQL plus multiprocessing. I tried fork(), but the children were not able to use one database handle; I had to copy it for each child, so it was really slow (8 inputs per second was the upper limit). Then I moved to threads, and the first thing I realized was that a database handle can't be shared among threads in DBI. So now I'm trying the combination DBIx::Threaded + threads, but I can't make it work right. I can, however, make it behave in two ways:

i) new threads are created with detach(); thread no. 2 successfully finishes its work, but at the same time it destroys the database handle
ii) threads are created without detach(); the database handle is shared with no problems until the system runs out of memory (because the threads don't die at the end of the subroutine)

Here is the relevant part of the code for variant ii):

#!/usr/bin/perl -w
use threads;
use DBI;
use IPC::SysV qw(IPC_PRIVATE S_IRWXU S_IRWXG S_IRWXO IPC_CREAT);
use IPC::Msg;
use DBIx::Threaded;
use Thread::Queue::Duplex;

# SysV IPC message queue
$msg = new IPC::Msg(0x4142678a, S_IRWXU | S_IRWXG | S_IRWXO | IPC_CREAT);
$msgtype = 2;

# creating the database handle
$dbh = DBIx::Threaded->connect(
    "DBI:mysql:database=$DataBaseName;host=$DataBaseHost",
    $DataBaseUser, $DataBasePass,
    { RaiseError => 1, AutoCommit => 0 },
) || die "Unable to connect to $DataBaseHost because $DBI::errstr";

# useless statement handle; there must be at least one in existence,
# otherwise $dbh is destroyed after the first cycle
$sth_zb = $dbh->prepare("SELECT * FROM testcallapp");

# taking data from the message queue
while ($msg->rcv($buf, 256)) {
    # printing the currently living threads
    foreach $thr (threads->list) {
        my $tred = $thr->tid;
        print "DEBUG: $tred\n";
    }
    # creating a new thread
    threads->new(\&vlakno, $buf);
    print "DEBUG: $buf !\n";
}

# subroutine for a thread
sub vlakno {
    my ($buffer) = @_;
    # splitting the buffer into data fields
    my @sysv  = split(/,/, $buffer);
    my $natv2 = $sysv[0];
    my $v2    = $sysv[1];
    # statement handle: finding out whether the data are in the database
    my $sth = $dbh->prepare("SELECT host FROM ... ");
    $sth->execute();
    $sth->bind_columns(\$host);
    # printing the data from the database
    while ($sth->fetch()) {
        print "DEBUGGER: $host\n";
    }
    # finishing the statement handle
    $sth->finish;
}
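One pattern that sidesteps the shared-handle problem entirely is to let a single dedicated thread own the database handle and feed it work through a queue, so no handle is ever cloned or shared. Below is a minimal sketch of that idea using the core Thread::Queue module; the DBI calls are shown as comments because the actual query belongs to your schema, and the sample values (alpha, beta, gamma) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# One dedicated thread owns the database handle; workers only enqueue
# requests, so the handle never crosses a thread boundary.
my $requests = Thread::Queue->new();

my $db_thread = threads->create(sub {
    # my $dbh = DBI->connect(...);   # handle created *inside* this thread
    while (defined(my $host = $requests->dequeue())) {
        # my $sth = $dbh->prepare("SELECT host FROM ... WHERE host = ?");
        # $sth->execute($host);
        print "DB thread looking up: $host\n";
    }
});

# any number of producer threads can enqueue work safely
$requests->enqueue($_) for qw(alpha beta gamma);
$requests->enqueue(undef);   # termination marker: tells the DB thread to exit
$db_thread->join();
```

Because Thread::Queue serializes access, the worker threads can detach or die freely without ever touching $dbh.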

So... do you have any ideas how to make it work this way, or how to meet the script's requirements some other way?

dr_k

Replies are listed 'Best First'.
Re: MySQL multiprocessing
by perrin (Chancellor) on Dec 11, 2007 at 14:47 UTC
    It doesn't make any sense that your forking approach was slow. Do you know what the slow part was? MySQL is very fast at creating new connections, and you can even pre-fork your connections so they are ready to roll when a query comes in.
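    The pre-forking approach could look roughly like the sketch below: fork the workers up front, and have each child open its own connection after the fork (a handle inherited across fork() must never be used by both parent and child). The DBI connect is commented out and the $DataBase* variables are the placeholders from the original post; the "worker ready" prints just stand in for the real queue-reading loop.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Pre-fork a small pool of workers; each one connects to MySQL itself,
# *after* the fork, so no handle is shared with the parent.
my $WORKERS = 4;
my @pids;

for my $n (1 .. $WORKERS) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # child: open a private connection here, then loop on the queue
        # my $dbh = DBI->connect(
        #     "DBI:mysql:database=$DataBaseName;host=$DataBaseHost",
        #     $DataBaseUser, $DataBasePass, { RaiseError => 1 },
        # );
        print "worker $n ready with its own connection\n";
        exit 0;
    }
    push @pids, $pid;
}
waitpid($_, 0) for @pids;
```

    Since MySQL connections are cheap, the pool is "ready to roll" once the children have started, and the per-query cost is just the query itself.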

    If I were doing this and the obvious forking approach was too slow, I would make a mod_perl server that waits for queries and returns the results. You could just send the queries to it over HTTP from your script that reads the queue. This is all basically done for you in the GoferTransport-HTTP module.
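    With DBD::Gofer the client side is just a wrapped DSN; the sketch below shows the general shape of such a connect string, assuming the http transport from the GoferTransport-HTTP distribution is installed and a gofer server is listening at the (entirely hypothetical) URL shown. This is a configuration fragment, not tested code; check the DBD::Gofer documentation for the exact attributes your setup needs.

```perl
use DBI;

# Hypothetical example: route queries through a Gofer HTTP proxy.
# The url and the $DataBase* placeholders must be replaced with real values.
my $dbh = DBI->connect(
    "dbi:Gofer:transport=http;url=http://db-proxy.example.com/gofer;"
  . "dsn=dbi:mysql:database=$DataBaseName;host=$DataBaseHost",
    $DataBaseUser, $DataBasePass,
);
```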

      You are right, the real slowdown is the XML-RPC client I use (Frontier::Client)... time to try another one, I suppose ;) And you are right again that, even leaving that aside, forking is way faster than threads. So my next step will be fork() plus a different XML-RPC client. Thanks ;]