PerlAcolyte has asked for the wisdom of the Perl Monks concerning the following question:

The following code works fine on my local machine, but it inserts duplicate records on my shared server. I am starting with a CSV file that contains 40,000 records, of which I only want to insert 7,000--my stored procedure decides if a record gets inserted or not. The server is usually inserting about 10,000 records, 3,000 of which are duplicates. I am calling the perl script as a cgi script. This is a mystery to me. (Local machine: Vista/MS SQL 2k5/ODBC DSN, Server: MS Server 2K3/MS SQL 2K5/ODBC DSN). Thanks for your help.
use DBI; my $dbh = DBI->connect( "$dbServer", "$dbUser", "$dbPass" ) or print " +Could not make connection to database"; open(DAILY, "$updateDir/$fileName") || print "CAN'T OPEN DAILY"; while(<DAILY>){ chomp; my ($field1,$field2,$field3,$field3,$field4,$field5,$field6,$field +7,$field8) = split(/,/); if ($field1 == 33){ my $spQuery = "exec my_sp ?,?,?,?,?,?,?"; $insert33 = $dbh->prepare( $spQuery ) or die ( "Cannot + prepare statement: ", $dbh->errstr(), "\n" ); $insert33->execute($field1,$field2,$field3,$field4,$fi +eld5,$field6,$field7) or die( "Cannot execute statement: ", $insert33->errst +r(), "\n" ); $insert33->finish(); } } close(DAILY);
code for stored proc
@field1 AS INTEGER, @field2 AS DATETIME, @field3 AS MONEY, @field4 AS MONEY, @field5 AS MONEY, @field6 AS MONEY, @field7 AS INTEGER AS BEGIN DECLARE @check1 INT SET @check1 = 0 SELECT @check1 = col1 FROM tableX WHERE col3 = @field1 IF @check1 > 0 BEGIN INSERT INTO priceHistory (col1,col2,col3,col4,col5,col6,col7) VALUES (@field1,@field2,@field3,@field4,@field5,@field6,@field7) END END

Replies are listed 'Best First'.
Re: DBI Inserting Duplicate Records
by graff (Chancellor) on Dec 30, 2007 at 00:38 UTC
    First, a slightly off-topic suggestion: put the "prepare()" statement outside (above) the while loop -- one of the advantages of using placeholders is "prepare once, execute often", because this saves a lot of overhead on the DB server side. (This would also involve putting the "finish()" call after the loop.)

    Getting back on topic, is it possible that the table definition on the remote server is different from the one on your local machine? E.g. maybe your local version has a uniqueness constraint that isn't being applied on the remote server's version?

    Another issue, since the stored procedure controls insertion by querying for values of "col1 FROM tableX WHERE col3 = @field1", is whether this other table on the remote server contains the same data as your local version.

    A third thing to check is whether the input data files being used locally vs. remotely really are the same. Could the remote server be getting a "longer" version of the CSV file? (Who knows how it might have gotten "longer"... still it's worth checking to be sure.)

    I don't see anything in the perl code that would lead to different behaviors depending on local vs. remote (CGI?) runtime environments. The differences are most likely a matter of table definitions or data.

Re: DBI Inserting Duplicate Records
by bradcathey (Prior) on Dec 29, 2007 at 18:18 UTC

    Maybe I'm missing something, but what does your stored procedure look like?


    —Brad
    "The important work of moving the world forward does not wait to be done by perfect men." George Eliot
      stored proc code posted. thanks again.
Re: DBI Inserting Duplicate Records
by almut (Canon) on Dec 30, 2007 at 00:25 UTC

    (Please take this with a grain of salt — I have virtually no experience with MS SQL ... I'm just replying in the hope that any suggestion might be better than none :)

    ...my stored procedure decides if a record gets inserted or not.

    How is tableX related to priceHistory, i.e. how does the result of the SELECT change, once a record has been inserted into priceHistory? In other words, how does tableX help to distinguish if inserting a certain record would produce a duplicate?

    Anyhow, sometimes, mysterious DB behaviour is the result of commits not happening when you think they should... I have no clear idea, though, how that would account for the difference between your local and the server machine, but maybe some different AutoCommit default setting of the ODBC driver, or some such. In the absence of any better ideas, I'd try to commit myself, or explicitly enable AutoCommit.  Just a thought.