Re: PERL DB Optimization

Before I offer my thoughts on your question, I must first comment on one issue I see in your code, namely that it does not use bind params. Apologies if you already know this, but...

SQL queries should follow this form:

my $sth = $dbh->prepare('update data_set2 
                         set data = ? where key2 = ?');
$sth->execute($val1, $key);

This gives you several benefits: proper escaping of your data, cacheability of statement handles (i.e. reusing the same statement handle for multiple updates, which saves time because the DB server does not have to re-parse the SQL), and you don't get people complaining at you to use bind params ;)
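To make the statement-handle reuse concrete, here is a minimal sketch. It assumes DBD::SQLite is installed and uses an in-memory database so it runs on its own; the table layout and the sample values in %new_data are made up for illustration, and only the update statement comes from the snippet above.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. Any other DBI driver is used the same way.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Hypothetical table that matches the column names in the update statement.
$dbh->do('create table data_set2 (key2 text primary key, data text)');
my $seed = $dbh->prepare('insert into data_set2 (key2, data) values (?, ?)');
$seed->execute($_, 'old') for qw(a b);

# Hypothetical new values keyed by key2. The embedded quote needs no manual escaping.
my %new_data = (a => "it's new", b => 'also new');

# Prepare once, then execute once per row with different bind values.
my $sth = $dbh->prepare('update data_set2 set data = ? where key2 = ?');
for my $key (sort keys %new_data) {
    $sth->execute($new_data{$key}, $key);
}

my $rows = $dbh->selectall_arrayref(
    'select key2, data from data_set2 order by key2');
print "$_->[0] => $_->[1]\n" for @$rows;
```

The prepare call sits outside the loop on purpose: only execute runs per row, which is where the saving comes from.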

Now, on to your question.

I have done several different styles of what you are trying to do, and the highest performance I have been able to get out of the process is by putting more intelligence in the SQL and less in the Perl. If you can make the database do more of the work, the Perl has less data to handle and will therefore run faster. If you do not have a database that supports subselects and joins, this may be harder than it otherwise would be.

For instance, your example looks like two problems:

1. creating new records for those that do not exist yet
2. updating records that already exist

The first can be handled by getting a list from the database of only those source records that don't yet exist in the destination table, i.e.

SELECT id_field, field_to_update 
    FROM table_1 WHERE id_field NOT IN
    (SELECT join_id_field FROM table_2)

and the second can be handled with slightly more complicated SQL, like this:

SELECT src.id_field, src.field_to_update
  FROM table_1 src, table_2 dest 
  WHERE src.id_field = dest.join_id_field 
  AND src.field_to_update != dest.field_to_be_updated

This will get you a list of the records that need to be updated.
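Putting the two queries together with bind params might look like the sketch below. It assumes DBD::SQLite is installed and runs against an in-memory database; the schema and sample rows are invented, and wrapping the writes in a single transaction is my own addition rather than something your original code necessarily needs.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed. Schema and rows are invented for the demo.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE table_1 (id_field INTEGER PRIMARY KEY, field_to_update TEXT)');
$dbh->do('CREATE TABLE table_2 (join_id_field INTEGER PRIMARY KEY, field_to_be_updated TEXT)');

my $seed1 = $dbh->prepare('INSERT INTO table_1 VALUES (?, ?)');
$seed1->execute(@$_) for [1, 'same'], [2, 'changed'], [3, 'brand new'];
my $seed2 = $dbh->prepare('INSERT INTO table_2 VALUES (?, ?)');
$seed2->execute(@$_) for [1, 'same'], [2, 'stale'];

# Problem 1: rows in table_1 with no counterpart in table_2.
my $missing = $dbh->selectall_arrayref(q{
    SELECT id_field, field_to_update
      FROM table_1 WHERE id_field NOT IN
      (SELECT join_id_field FROM table_2)
});

# Problem 2: rows present in both tables whose values differ.
my $changed = $dbh->selectall_arrayref(q{
    SELECT src.id_field, src.field_to_update
      FROM table_1 src, table_2 dest
     WHERE src.id_field = dest.join_id_field
       AND src.field_to_update != dest.field_to_be_updated
});

# Apply both lists with prepared handles inside one transaction.
my $ins = $dbh->prepare(
    'INSERT INTO table_2 (join_id_field, field_to_be_updated) VALUES (?, ?)');
my $upd = $dbh->prepare(
    'UPDATE table_2 SET field_to_be_updated = ? WHERE join_id_field = ?');

$dbh->begin_work;
$ins->execute($_->[0], $_->[1]) for @$missing;
$upd->execute($_->[1], $_->[0]) for @$changed;
$dbh->commit;

printf "%d inserted, %d updated\n", scalar @$missing, scalar @$changed;
```

One caveat: a `!=` comparison never matches when either side is NULL, so if your columns allow NULLs those rows need separate handling.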

To take this a step further, if your database supports it, you can even do the updates purely on the database side, although that query is much more difficult and I do not have the time or inclination to figure it out for an example problem :)
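For the curious, one possible shape for the database-side version is sketched below. This is not a tuned or tested answer for your schema: it assumes DBD::SQLite and an invented in-memory schema, and it uses INSERT ... SELECT plus an UPDATE with a correlated subquery, which SQLite accepts. Other databases often have their own join-update syntax, so treat the exact SQL as an assumption to check against your server.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is installed. Schema and rows are invented for the demo.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, AutoCommit => 1 });
$dbh->do('CREATE TABLE table_1 (id_field INTEGER PRIMARY KEY, field_to_update TEXT)');
$dbh->do('CREATE TABLE table_2 (join_id_field INTEGER PRIMARY KEY, field_to_be_updated TEXT)');

my $seed1 = $dbh->prepare('INSERT INTO table_1 VALUES (?, ?)');
$seed1->execute(@$_) for [1, 'same'], [2, 'changed'], [3, 'brand new'];
my $seed2 = $dbh->prepare('INSERT INTO table_2 VALUES (?, ?)');
$seed2->execute(@$_) for [1, 'same'], [2, 'stale'];

# New records: copy every table_1 row that has no match in table_2.
my $inserted = $dbh->do(q{
    INSERT INTO table_2 (join_id_field, field_to_be_updated)
    SELECT id_field, field_to_update
      FROM table_1
     WHERE id_field NOT IN (SELECT join_id_field FROM table_2)
});

# Existing records: overwrite only the rows whose value differs from table_1.
my $updated = $dbh->do(q{
    UPDATE table_2
       SET field_to_be_updated =
           (SELECT src.field_to_update
              FROM table_1 src
             WHERE src.id_field = table_2.join_id_field)
     WHERE EXISTS
           (SELECT 1
              FROM table_1 src
             WHERE src.id_field = table_2.join_id_field
               AND src.field_to_update != table_2.field_to_be_updated)
});

# do() returns the affected row count, or the string "0E0" when no rows matched.
print "inserted: $inserted, updated: $updated\n";
```

No rows travel to the Perl side at all here, which is the point of pushing the work into the database.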

Hope that helps
