in reply to How do I improve the efficiency of my script

The first reply is very important: definitely prepare each statement once, then execute it many times with different bind values.
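
For instance, a minimal sketch of that pattern (the table and column names here are placeholders of my own, not taken from the original script):

    # prepare the statement handle once, outside any loop ...
    my $ins_sth = $dbh->prepare(
        "insert into oligos (project_id, oligos_type_id, sequence)
         values (?, ?, ?)"
    );

    # ... then execute it repeatedly, once per row, reusing the handle
    for my $rec (@records) {
        $ins_sth->execute( $rec->{project_id}, $rec->{oligos_type_id},
            $rec->{sequence} );
    }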

Apart from that, if you don't have a really vast number of distinct "$project_name" or "$oligos_type_id" values, you might consider loading all of them into a hash before you start reading the data file:

    my $proj_id_sth = $dbh->prepare(
        "select project_name, project_id from projects"
    );
    $proj_id_sth->execute;

    # map each project name to its numeric id
    my %proj_name_id;
    while ( my $row = $proj_id_sth->fetchrow_arrayref ) {
        my ( $pr_name, $pr_id ) = @$row;
        $proj_name_id{$pr_name} = $pr_id;
    }
and similarly for the "oligos_type_name / oligos_type_id" mapping (that is, using a separate but similar %oligos_name_id hash). That way, you execute each of those queries just once before reading the file; then, as you read the file, you simply use each "name" value as a hash key to look up the corresponding "id" value for your insert.
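
Putting the pieces together, the read loop might look something like this (a sketch only: the input file layout and the insert's table/column names are my assumptions, and %oligos_name_id is built the same way as %proj_name_id above):

    my $ins_sth = $dbh->prepare(
        "insert into oligos (project_id, oligos_type_id, sequence)
         values (?, ?, ?)"
    );

    open my $fh, '<', $data_file or die "open $data_file: $!";
    while (<$fh>) {
        chomp;
        my ( $project_name, $oligos_type_name, $sequence ) = split /\t/;

        # two hash lookups replace two select queries per input line
        $ins_sth->execute(
            $proj_name_id{$project_name},
            $oligos_name_id{$oligos_type_name},
            $sequence,
        );
    }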

One more thing to consider: instead of doing the inserts via DBI in your script, you could just print the rows of field values to a tab-delimited (or other suitable) data file, then use your DB engine's native bulk-load utility (SQL*Loader for Oracle, mysqlimport or LOAD DATA INFILE for MySQL, etc.) to apply the inserts from that file. That will be significantly faster than doing individual inserts with Perl/DBI, and the difference becomes dramatic when a single run involves thousands of inserts.
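
As a rough sketch (again assuming the hypothetical "oligos" table and input layout from above), the insert loop reduces to printing lines, and the actual loading happens outside Perl:

    open my $in,  '<', $data_file   or die "open $data_file: $!";
    open my $out, '>', 'oligos.txt' or die "open oligos.txt: $!";
    while (<$in>) {
        chomp;
        my ( $project_name, $oligos_type_name, $sequence ) = split /\t/;

        # one tab-delimited output line per row to be inserted
        print $out join( "\t",
            $proj_name_id{$project_name},
            $oligos_name_id{$oligos_type_name},
            $sequence
        ), "\n";
    }
    close $out;

    # then, for mysql (mysqlimport derives the table name from the
    # file's basename, hence "oligos.txt"):
    #   mysqlimport --local your_db oligos.txt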

Re^2: How do I improve the efficiency of my script
by lomSpace (Scribe) on Jul 22, 2009 at 21:05 UTC
    Thanks for the knowledge!