dwalin has asked for the wisdom of the Perl Monks concerning the following question:
#!/usr/bin/perl

use strict;
use warnings;
use 5.14.0;

use Benchmark;
use DBI;

# Usage: test.pl <columns> <rows> <iterations>
my $dbh = DBI->connect('dbi:SQLite::memory:', '', '', {
    AutoCommit => 1,
    RaiseError => 1,
    PrintError => 0,
    ChopBlanks => 1,
}) or die DBI->errstr;

# Create a table with $ARGV[0] integer columns.
$dbh->do("CREATE TABLE test_bulk_insert (\n"
    . (join ",\n", map { "field$_ int" } (1..$ARGV[0]))
    . ");\n") or die $dbh->errstr;

my $sth = $dbh->prepare("INSERT INTO test_bulk_insert (\n"
    . (join ",\n", map { "field$_" } (1..$ARGV[0]))
    . ') VALUES ('
    . (join ",\n", map { '?' } (1..$ARGV[0]))
    . ')') or die $dbh->errstr;

# One array per column, bound once up front for execute_array.
my @newdata;
push @newdata, [] for (1..$ARGV[0]);
$sth->bind_param_array($_ + 1, $newdata[$_]) for keys @newdata;

# Generate $ARGV[1] rows of fixed-width random test data.
my $data;
$data .= sprintf(("%-4d" x $ARGV[0]) . "\n",
    map { int rand 9999 } (1..$ARGV[0])) for (1..$ARGV[1]);
my $template = "A4" x $ARGV[0];

timethese($ARGV[2], {
    # 'new': parse into column arrays, insert with execute_array.
    'new' => sub {
        my $buf = $data;
        my ($curr, $prev);
        while ($buf =~ /\n/g) {
            $prev = defined $curr ? $curr : 0;
            $curr = pos $buf;
            my $_ = substr $buf, $prev, $curr - $prev;
            chomp;
            my @cols = unpack $template;
            push $_, shift @cols for (@newdata);    # 5.14 array-ref autoderef
        };
        eval {
            $dbh->do("BEGIN IMMEDIATE TRANSACTION");
            $dbh->do("DELETE FROM test_bulk_insert");
            $sth->execute_array({}, @newdata);
            $dbh->commit;
        } or die $@;
        $#{$newdata[$_]} = -1 for keys @newdata;    # empty the column arrays
    },
    # 'old': parse into row tuples, insert with one execute per row.
    'old' => sub {
        my $buf = $data;
        my ($curr, $prev, @batch);
        while ($buf =~ /\n/g) {
            $prev = defined $curr ? $curr : 0;
            $curr = pos $buf;
            my $_ = substr $buf, $prev, $curr - $prev;
            chomp;
            push @batch, [unpack $template];
        };
        eval {
            $dbh->do("BEGIN IMMEDIATE TRANSACTION");
            $dbh->do("DELETE FROM test_bulk_insert");
            $sth->execute(@$_) for @batch;
            $dbh->commit;
        } or die $@;
    },
});

$dbh->disconnect;
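The script takes three arguments: the number of columns, the number of rows of generated test data, and the number of Benchmark iterations. The run below uses 100 columns, 1000 rows and 100 iterations: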
dwalin@sol:/tmp$ perl test.pl 100 1000 100
Benchmark: timing 100 iterations of new, old...
       new: 22 wallclock secs (22.10 usr + 0.28 sys = 22.38 CPU) @ 4.47/s (n=100)
       old: 10 wallclock secs ( 9.87 usr + 0.08 sys =  9.95 CPU) @ 10.05/s (n=100)

And the cause of slowness is not my code, it's DBD::SQLite. Under Devel::NYTProf it looks most peculiar:
'old' line-by-line insert:
$sth->execute(@$_) for @batch;
# spent 10.0ms making 10000 calls to DBI::st::execute, avg 1µs/call
'new' bulk insert:
$sth->execute_array({}, @newdata);
# spent 3.17s making 10 calls to DBI::st::execute_array, avg 317ms/call
# spent 3.17s making 10 calls to DBD::_::st::execute_array, avg 317ms/call
Something in Devel::NYTProf visibly affects execution: under the profiler the 'new' code becomes ten times slower than the 'old', but the overall picture is the same - bulk insert is slower. Why is that? I tried to look inside DBD::SQLite, but I'm not advanced enough to dig deep into DBI and its drivers.
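One clue is the second NYTProf line above: the time is attributed to DBD::_::st::execute_array, which suggests the generic execute_array that DBI supplies to drivers lacking a native array interface is in play. That fallback, roughly speaking, just loops over the rows, re-binding and executing each one, so it can only add overhead on top of a plain execute() loop. A minimal sketch of that assumed behavior (illustrative only, not DBI's actual source; the sub name is made up):

# Simplified sketch of the generic execute_array fallback a driver
# inherits when it has no native bulk interface. It degenerates to one
# bind/execute round trip per row, plus bookkeeping, which is why it
# cannot beat calling execute() per row yourself.
sub generic_execute_array {
    my ($sth, $attr, @columns) = @_;    # @columns: one array ref per column
    my $rows = @{ $columns[0] // [] };
    my $executed = 0;
    for my $row (0 .. $rows - 1) {
        # re-bind every placeholder for every single row...
        $sth->bind_param($_ + 1, $columns[$_][$row]) for 0 .. $#columns;
        # ...then fall back to an ordinary per-row execute
        $executed++ if $sth->execute;
    }
    return $executed;
}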
I tested it with perl-5.14.1 under Solaris 9 x86.
Regards,
Alex.
Replies are listed 'Best First'.
Re: DBD::SQLite bulk insert woes
by danb (Friar) on Jun 18, 2011 at 19:13 UTC
by dwalin (Monk) on Jun 18, 2011 at 19:46 UTC
by Marshall (Canon) on Jun 18, 2011 at 22:32 UTC
by dwalin (Monk) on Jun 19, 2011 at 08:40 UTC
by Marshall (Canon) on Sep 12, 2012 at 07:17 UTC
Re: DBD::SQLite bulk insert woes
by Anonymous Monk on Jun 21, 2011 at 03:36 UTC
by Anonymous Monk on Jun 21, 2011 at 04:03 UTC
by dwalin (Monk) on Jun 21, 2011 at 08:30 UTC