seaver has asked for the wisdom of the Perl Monks concerning the following question:
I've profiled a script of mine, only to find that the biggest culprit in slowing it down is the function below.
As you can see, I have several tables, each of which is connected, in 'descending' order, by a unique ID from the table above.
So every model has the correct atom ID, every atom has the correct residue ID, and every residue has the correct chain ID.
Every line I parse from a file has the data for all four tables, so I can handle all four tables in one function. The problem is that I want to be able to avoid querying for any of the unique IDs.
However, the application has to be able to resume progress if it crashes or the computer needs rebooting, so if it goes over any of the same files again, the data is already there in the database. How can I speed up the whole function? Every call costs 0.000637 seconds, which is huge given the sheer volume of calls:
```
%Time  Sec.        #calls    sec/call  F  name
32.16  20144.1603  31635687  0.000637     DBI::st::execute
```
Here's the function:
```perl
sub addAtom {
    my $self = shift;
    my $pdb  = shift;
    my $atom = shift;
    my $ch   = $atom->chainId();
    my $dbh  = DBI->connect( $dbi, $u, $p, { 'RaiseError' => 1 } );
    my $cid  = $self->getCID( $pdb, $atom->chainId(), $dbh );
    my ( $rid, $aid, $mid, $res, $ato );

    if ( !$cid ) {
        $dbh->do("INSERT INTO chain (pdb,chain) VALUES ('$pdb','$ch')");
        $cid = $self->getCID( $pdb, $atom->chainId(), $dbh );
    }
    if ($cid) {
        $rid = $self->getRID( $cid, $atom->resNumber, $atom->resName, $dbh );
        if ( !$rid ) {
            $res = $dbh->quote( $atom->resName() );
            $dbh->do( "INSERT INTO residue (cid,rnumber,rname) VALUES ('$cid','"
                    . $atom->resNumber . "',$res)" );
            $rid = $self->getRID( $cid, $atom->resNumber, $atom->resName, $dbh );
        }
        if ($rid) {
            $aid = $self->getAID( $rid, $atom->atomName, $dbh );
            if ( !$aid ) {
                $ato = $dbh->quote( $atom->atomName() );
                $dbh->do("INSERT INTO atom (rid,aname) VALUES ('$rid',$ato)");
                $aid = $self->getAID( $rid, $atom->atomName, $dbh );
            }
            if ($aid) {
                $mid = $self->getMID( $aid, $atom->model, $dbh );
                if ( !$mid ) {
                    $dbh->do( "INSERT INTO model (aid,model,x,y,z) VALUES ('$aid','"
                            . $atom->model . "','" . $atom->x . "','"
                            . $atom->y . "','" . $atom->z . "')" );
                    $mid = $self->getMID( $aid, $atom->model, $dbh );
                }
            }
        }
    }
    $dbh->disconnect();
}
```
Cheers
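The profile points at DBI::st::execute, but each call to addAtom also opens and tears down a fresh database connection and re-parses every statement. Below is a minimal sketch of the usual fix, connecting once and caching prepared statements with placeholders; the column names and the $self->{dsn}/{user}/{pass} slots are assumptions for illustration, not the poster's actual code:

```perl
use strict;
use warnings;
use DBI;

# Connect once and keep the handle on the object instead of reconnecting
# inside every addAtom call.
sub dbh {
    my $self = shift;
    $self->{dbh} ||= DBI->connect( $self->{dsn}, $self->{user}, $self->{pass},
                                   { RaiseError => 1, AutoCommit => 1 } );
    return $self->{dbh};
}

# Example lookup using a cached prepared statement with placeholders.
sub getCID {
    my ( $self, $pdb, $chain ) = @_;
    my $sth = $self->dbh->prepare_cached(
        'SELECT cid FROM chain WHERE pdb = ? AND chain = ?'
    );
    $sth->execute( $pdb, $chain );
    my ($cid) = $sth->fetchrow_array;
    $sth->finish;
    return $cid;
}
```

The other get*/INSERT pairs would follow the same pattern, and using placeholders also removes the need for $dbh->quote.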
UPDATE:
Thanks for the replies; I had actually broached the subject of avoiding repeated connections in this thread.
I think the best solution to my problem is simply to use stored procedures, and also to keep a hash of the 'current' ids.
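For the 'hash of current ids' idea, a hypothetical sketch (cached_cid, the cache key format, and the getCID signature are made up for illustration): since atoms from the same chain tend to arrive together, most lookups hit the hash rather than the database.

```perl
# Hypothetical per-run cache: once a (pdb, chain) pair has been resolved to a
# cid, later atoms from the same chain skip the SELECT entirely.
my %cid_cache;

sub cached_cid {
    my ( $self, $pdb, $chain ) = @_;
    my $key = "$pdb\0$chain";
    $cid_cache{$key} ||= $self->getCID( $pdb, $chain );
    return $cid_cache{$key};
}
```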
I have one more question, though: is it possible to get the Perl DBI or MySQL to return the newly created ID, so that I don't have to re-query for the ID itself?
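It is; both DBD::mysql and DBI itself can hand back the AUTO_INCREMENT value generated by the last INSERT on that handle. A sketch, assuming the id columns are AUTO_INCREMENT (not tested against the poster's schema):

```perl
# Insert with placeholders, then read the generated id back from the handle
# instead of re-querying the table.
$dbh->do( 'INSERT INTO chain (pdb, chain) VALUES (?, ?)', undef, $pdb, $ch );

my $cid = $dbh->{mysql_insertid};                      # DBD::mysql attribute
# or, with DBI 1.38 and later:
# my $cid = $dbh->last_insert_id( undef, undef, 'chain', 'cid' );
```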
thanks
Sam
Replies are listed 'Best First'.
Re: Avoiding too many DB queries
by shemp (Deacon) on Jun 17, 2004 at 16:22 UTC

Re: Avoiding too many DB queries
by iburrell (Chaplain) on Jun 17, 2004 at 16:26 UTC

Re: Avoiding too many DB queries
by EvdB (Deacon) on Jun 17, 2004 at 17:04 UTC

Re: Avoiding too many DB queries
by shemp (Deacon) on Jun 17, 2004 at 18:26 UTC

Re: Avoiding too many DB queries
by Plankton (Vicar) on Jun 17, 2004 at 16:33 UTC

Re: Avoiding too many DB queries
by jeffa (Bishop) on Jun 18, 2004 at 13:05 UTC
by seaver (Pilgrim) on Jun 18, 2004 at 17:23 UTC
by eric256 (Parson) on Jun 18, 2004 at 17:42 UTC