MelaOS has asked for the wisdom of the Perl Monks concerning the following question:

Hi fellow monks, I have a log file that I parse and turn into SQL INSERT statements, which I then run against an MS SQL 2000 database using Perl.
The thing is, the log file is pretty huge and can grow to a few hundred thousand lines, which means a few hundred thousand line-by-line SQL insertions.
I tried to speed it up by appending a few hundred of the INSERT statements together and executing them in a single call to $db->Sql($sql);, but I found this method is pretty unstable: when I tried to insert 1,000 rows, the number that actually made it in varied, somewhere in the 800s or 900s, but never the full 1,000.

The other way I thought of was to pass the whole giant string to a stored procedure and execute it there, which I think would be more stable, but the nvarchar(4000) limit on the parameter is too little space.

What would be a better way to do this? Any help would be much appreciated! Thanks in advance.

Re: How to Do multiples SQL Insert at one time?
by roboticus (Chancellor) on Jan 08, 2008 at 05:15 UTC
    A few notes:

    First, as others have mentioned, you normally don't use SQL for bulk loading. Instead, use the bulk loading tool provided for your database (e.g., BCP for Sybase / MSSQL Server).
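
    For instance, here is a minimal sketch of driving BCP from Perl. The server, database, table, login, and file names are all hypothetical, and the colon field terminator assumes colon-delimited log lines like the ones described above:

      use strict;
      use warnings;

      # Hypothetical names throughout; adjust to your own server and schema.
      my @bcp = (
          'bcp', 'mydb.dbo.log_table', 'in', 'logfile.txt',
          '-c',            # character (text) mode
          '-t', ':',       # field terminator matching the log format
          '-b', '500',     # commit a batch every 500 rows
          '-S', 'myserver',
          '-U', 'myuser',
          '-P', 'mypassword',
      );
      system(@bcp) == 0 or die "bcp failed: $?";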

    However, if you want to use SQL, there are a few tricks you can use to speed things up:

    • Use placeholders rather than creating the SQL scripts manually. That way, you'll save the overhead of repeatedly setting up the statement.
    • Use transactions to group inserts together. Much of the time spent in database updates is the integration of the completed transaction into the database. If you don't specify a transaction, some databases will use one for each statement you execute. Instead, start a transaction, execute multiple inserts (I often use 500 or so), then commit the transaction. It'll go a good bit faster. (A sketch combining this tip with placeholders follows the list.)
    • The best tip, if you can use it, is to drop the indexes on the table, perform the inserts, then reindex the table.
    • Finally, if you can't use the previous tip, then create a temporary table with the same columns and no indices. Your inserts should go quickly into this table. Then when you have the temporary table created, do a single insert to copy your data into the production table. Something like:
      insert into FOO select * from BAR;
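
    Here is a short sketch of the first two tips combined: one prepared statement with placeholders, executed inside transactions of 500 rows each. The DSN, table, and column names are placeholders for whatever your schema actually uses:

      use strict;
      use warnings;
      use DBI;

      # Hypothetical DSN and credentials; AutoCommit off lets us batch.
      my $dbh = DBI->connect('dbi:ODBC:mydsn', 'user', 'password',
          { RaiseError => 1, AutoCommit => 0 });

      my $sth = $dbh->prepare(
          'INSERT INTO log_table (field1, field2, field3) VALUES (?, ?, ?)');

      open my $log, '<', 'logfile.txt' or die "logfile.txt: $!";
      my $count = 0;
      while (my $line = <$log>) {
          chomp $line;
          $sth->execute(split /:/, $line);
          $dbh->commit unless ++$count % 500;   # one transaction per 500 rows
      }
      $dbh->commit;    # flush the final partial batch
      $dbh->disconnect;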

    I hope you find this helpful...

    ...roboticus

      Hi guys, great advice, really appreciate it.
      Since I don't really need everything in Perl and my main concern is speed, I will try to do the bulk insert using BCP first. Again, thanks!!
        If you are using MS SQL Server 2000, another option might be to generate a Data Transformation Services (DTS) script that introduces a temp table into the database, then use that temp table for the insert. The DTS script can be saved and run from Perl. You can delete the temp table afterwards with a T-SQL command like:
        if exists(select 1 from INFORMATION_SCHEMA.tables where table_name = 'Table_with_data') DROP TABLE Table_with_data;
      I totally agree. It is better to drop the indexes first (turning the table into a 'heap') and then do the inserts, thus avoiding the overhead of updating the indexes. Then, assuming that you are reading from the log file and that each line has the format $line="value1:value2:value3", you can use placeholders:

      open(FILE,"logfile.txt")||die; while($line=<FILE>){ ($field1,$field2,$field3)=split(/:/,$line); $sth = $dbh->prepare ('INSERT INTO tablename (fieldname1,fieldname2,fi +eldname3) VALUES(?,?,?)'); $sth->execute ($field1,$field2,$field3); } $sth->finish ();
      As roboticus points out, it might be better to group the inserts and do a commit, say, every 500 records. You might also check for errors when executing the statement and roll back the transaction when one occurs.

        I hope you don't really use code like this. If you want DBI to do bulk loading faster, you should also write faster code: that prepare should be outside the loop.

        open my $log, "<", $logfile or die "$logfile: $!";
        my $sth = $dbh->prepare("insert into foo values (?,?,?)");
        while (<$log>) {
            chomp;
            $sth->execute(split m/:/, $_);
        }
        $sth->finish;
        $dbh->commit;
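
        One possible variation on the above adds the error checking and rollback suggested earlier in the thread. The connection details here are hypothetical; with AutoCommit off, the whole batch forms one real transaction:

          use strict;
          use warnings;
          use DBI;

          # Hypothetical connection; RaiseError makes execute die on failure,
          # which the eval below catches.
          my $dbh = DBI->connect('dbi:ODBC:mydsn', 'user', 'password',
              { RaiseError => 1, AutoCommit => 0 });
          my $sth = $dbh->prepare('insert into foo values (?,?,?)');

          open my $log, '<', 'logfile.txt' or die "logfile.txt: $!";
          eval {
              while (<$log>) {
                  chomp;
                  $sth->execute(split m/:/, $_);
              }
              $dbh->commit;          # all rows or none
          };
          if ($@) {
              warn "insert failed, rolling back: $@";
              $dbh->rollback;        # undo the partial batch
          }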

        Enjoy, Have FUN! H.Merijn
Re: How to Do multiples SQL Insert at one time?
by perrin (Chancellor) on Jan 08, 2008 at 04:06 UTC
    Don't use DBI for mass inserts. All databases have some kind of bulk loading tool that is massively faster than SQL statements. Use it, and use Perl to prepare the CSV file (or whatever input format the loader wants).
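
    For example, a minimal sketch of massaging the colon-delimited log into a tab-delimited file for the loader to pick up (both file names are hypothetical; BCP's default field terminator happens to be tab):

      use strict;
      use warnings;

      # Swap delimiters and let the bulk loader do the heavy lifting.
      open my $in,  '<', 'logfile.txt' or die "logfile.txt: $!";
      open my $out, '>', 'bulk.dat'    or die "bulk.dat: $!";
      while (my $line = <$in>) {
          chomp $line;
          print {$out} join("\t", split /:/, $line), "\n";
      }
      close $out or die "bulk.dat: $!";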
Re: How to Do multiples SQL Insert at one time?
by kyle (Abbot) on Jan 08, 2008 at 04:13 UTC
Re: How to Do multiples SQL Insert at one time?
by richardX (Pilgrim) on Jan 08, 2008 at 22:59 UTC
    As the others have said, use the SQL Server tools for bulk loading, which also let you control performance factors such as database logging. Use a bulk load and set BATCHSIZE to an appropriate value based on some testing of your own: http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/incbulkload.mspx
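
    As a sketch, a BULK INSERT with an explicit BATCHSIZE can be issued from Perl through DBI. The DSN, table name, file path, and batch size below are hypothetical, and note that the path is resolved on the SQL Server machine, not on the client:

      use strict;
      use warnings;
      use DBI;

      my $dbh = DBI->connect('dbi:ODBC:mydsn', 'user', 'password',
          { RaiseError => 1 });

      # Server-side path; the login needs bulk insert permission.
      $dbh->do(q{
          BULK INSERT dbo.log_table
          FROM 'C:\data\logfile.txt'
          WITH (FIELDTERMINATOR = ':', ROWTERMINATOR = '\n', BATCHSIZE = 5000)
      });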

    Richard

    There are three types of people in this world, those that can count and those that cannot. Anon