Fast Building of SQL Statements

This is for those of us who are just too stubborn to use SQL front-ends like Class::DBI or DBIx::Recordset.

Over the last few months, I've taken to putting the fields I want to insert or select into a list placed inside a bareblock with the subroutine:

{
    my @FIELDS = qw( field1 field2 field3 );

    sub run_insert 
    {
        my $dbh = shift || return;
        my ($field1, $field2, $field3) = @_;

        my $sql = q/
            INSERT 
            ( / . 
                join(',' @FIELDS) . 
            q/)
            INTO some_table 
            VALUES 
            ( *** PLACEHOLDERS HERE ***)
        /;
        my $sth = $dbh->prepare($sql);
        $sth->execute($field1, $field2, $field3);
        $sth->finish;
    }
}
[download]

I've gone through a few different tricks for getting the placeholders into the SQL. Initialy, it was an ugly while loop which made each field into a '?', adding a comma unless it was the last element of the list.

It wasn't long before I hit myself over the head and used a join/map construct instead:

$BEGINING_SQL . 
join(',', map('?', @FIELDS)) . 
$ENDING_SQL;
[download]

Much nicer than the while loop. However, map still scales linearly, so this will be slow on large tables or when calling the subroutine sequentially.

Not knowing about how the x operator works in list context, I recently saw golf entry by BrowserUK which enlightened me to a new way of building the placeholders:

$BEGINING_SQL . 
join(',', '?' x @FIELDS) . 
$ENDING_SQL;
[download]

This method is fast, as the following benchmark shows:

#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

my $count = shift || 100000;
my @test_array = (0 .. 99);

sub map_bench 
{
    my @array = map { '?' } @test_array;
}

sub x_bench 
{
    my @array = '?' x @test_array;
}

cmpthese($count, {
    map => \&map_bench, 
    x   => \&x_bench, 
});


__OUTPUT__
Benchmark: timing 100000 iterations of map, x...
       map: 76 wallclock secs (76.75 usr +  0.00 sys = 76.75 CPU) @ 13
+02.93/s (n=100000)
         x:  1 wallclock secs ( 1.91 usr +  0.00 sys =  1.91 CPU) @ 52
+356.02/s (n=100000)
       Rate   map     x
map  1303/s    --  -98%
x   52356/s 3918%    --
[download]

This method saves a few characters, and I don't think its any less clear than the map style (a quick glance at perlop will enlighten people unfamiler with the specifics of x). Increasing @test_array's size to 1000 elments shows a small but significant slowdown in x's speed (about 30k runs per second on my dev system), while map is reduced to around 100 runs per second (about a ten-fold decrease, which should be expected since we're increasing the array's size by 10).

----
I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
-- Schemer

Note: All code is untested, unless otherwise stated

Comment on Fast Building of SQL Statements Select or Download Code

Replies are listed 'Best First'.
•Re: Fast Building of SQL Statements by merlyn (Sage) on Aug 27, 2003 at 17:59 UTC
I'm pretty sure you want: `('?') x @test_array;` [download] in place of `'?' x @test_array;` [download] unless you wanted `"??????"` instead of `qw(? ? ? ? ? ?)`; -- Randal L. Schwartz, Perl hacker Be sure to read my standard disclaimer if this is a reply.	[reply] [d/l] [select]
Re: Fast Building of SQL Statements by Abigail-II (Bishop) on Aug 27, 2003 at 18:02 UTC
This method is fast, as the following benchmark shows: Yeah, but it's also wrong, as the following code shows: #!/usr/bin/perl use strict; use warnings; use Benchmark qw( cmpthese ); my $count = shift \|\| 100000; my @test_array = (0 .. 99); my (@array1, @array2); sub map_bench { @array1 = map { '?' } @test_array; } sub x_bench { @array2 = '?' x @test_array; } map_bench; x_bench; print "\@test_array has ", scalar @test_array, " elements\n"; print "\@array1 has ", scalar @array1, " elements\n"; print "\@array2 has ", scalar @array2, " elements\n"; __END__ @test_array has 100 elements @array1 has 100 elements @array2 has 1 elements [download] The `x` operator will only return a multi-element list if all of the following conditions are met: The operator is in list context. The right-hand side of the operator is greater than 1. The left-hand side of the operator is surrounded by parens. And the last condition isn't met. Here's a better benchmark, `x` is still winning, but not with such a margin: #!/usr/bin/perl use strict; use warnings; use Benchmark qw /cmpthese/; our @test_array = (0 .. 99); our (@array1, @array2); cmpthese -2 => { map => '@::array1 = map {"?"} @::test_array', x => '@::array2 = ("?") x @::test_array' }; die "Benchmark failed" unless @test_array == @array1 && @test_array == @array2; __END__ Benchmark: running map, x for at least 2 CPU seconds... map: 3 wallclock secs ( 2.09 usr + 0.03 sys = 2.12 CPU) @ 46 +08.49/s (n=9770) x: 2 wallclock secs ( 2.28 usr + 0.00 sys = 2.28 CPU) @ 14 +508.33/s (n=33079) Rate map x map 4608/s -- -68% x 14508/s 215% -- [download] Abigail	[reply] [d/l] [select]
Re: Re: Fast Building of SQL Statements by zby (Vicar) on Aug 28, 2003 at 07:57 UTC
To defend his technique I slightly modified his benchmarking script: #!/usr/bin/perl use strict; use warnings; use Benchmark qw( cmpthese ); my $count = shift \|\| 100000; my @test_array = (0 .. 99); sub map_bench { my $result = join(', ', map { '?' } @test_array); } sub x_bench { my $result = '?, ' x @test_array; } cmpthese($count, { map => \&map_bench, x => \&x_bench, }); __END__ Benchmark: timing 100000 iterations of map, x... map: 28 wallclock secs (27.94 usr + 0.00 sys = 27.94 CPU) @ 35 +78.97/s (n=100000) x: 1 wallclock secs ( 0.77 usr + 0.00 sys = 0.77 CPU) @ 12 +9533.68/s (n=100000) Rate map x map 3579/s -- -97% x 129534/s 3519% -- [download] Update: Unfortunately I forgot about the first reason of using join - getting rid of the last comma. Perhaps it should be `my $result = ('?, ' x $#test_array) . '?'` [download] It's not as elegant as it was. But still simpler than `my $result = join(', ', map { '?' } @test_array);` [download]	[reply] [d/l] [select]
Re: Fast Building of SQL Statements by perrin (Chancellor) on Aug 27, 2003 at 17:24 UTC
You're optimizing the wrong thing. Try using prepare_cached instead of prepare. And try using Class::DBI. I resisted for a long time too, but it really is awfully convenient.	[reply]
Re: Re: Fast Building of SQL Statements by hardburn (Abbot) on Aug 27, 2003 at 17:34 UTC
I almost always do use `prepare_cached` (decided against it for the example above), but that won't speed up the initial generation of SQL. Also, its not like converting `map` to `x` is obfu. I actually think its slightly cleaner, and gets bonus points for being faster. ---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident. -- Schemer Note: All code is untested, unless otherwise stated	[reply] [d/l] [select]
Re: Fast Building of SQL Statements by dws (Chancellor) on Aug 27, 2003 at 22:00 UTC
However, map still scales linearly, so this will be slow on large tables or when calling the subroutine sequentially. The relative performance of map in this regard is barely measurable in comparison with the time it takes to execute the query. I think you're micro-optimizing in the wrong place.	[reply]
Re: Fast Building of SQL Statements by Aristotle (Chancellor) on Aug 28, 2003 at 11:45 UTC
I don't see this as an optimization so much as a style question. As others have mentioned, the query building speed is unlikely to matter. I'd prefer this: `my @FIELDS = qw( field1 field2 field3 ); my @PLACEHOLDERS = ('?') x @FIELD; my $sql = do { local $" = ','; "INSERT (@FIELDS) INTO foo VALUES (@PLACEHOLDERS)"; };` [download] WRT optimization, since you're using a closure to keep a constant list of field names inside of, what's the point of rebuilding the SQL for every query? `{ my $SQL; BEGIN { my @FIELDS = qw( field1 field2 field3 ); my @PLACEHOLDERS = ('?') x @FIELD; local $" = ','; $SQL = "INSERT (@FIELDS) INTO foo VALUES (@PLACEHOLDERS)"; } sub run_insert { my $dbh = shift \|\| return; my $sth = $dbh->prepare($SQL); $sth->execute(@_); $sth->finish; } }` [download] Makeshifts last the longest.	[reply] [d/l] [select]
Re: Re: Fast Building of SQL Statements by DapperDan (Pilgrim) on Aug 28, 2003 at 12:36 UTC
++. This is the best one, for my money. Very readable, and probably could be pulled out to a general sub which generates the `run_insert` sub for you. Nice use of a closure as well. I like this a lot as its a piece of code I have written badly over and over. (Rushing back to work after lunch, so not doing it myself (yet).)	[reply]
Re^3: Fast Building of SQL Statements by Aristotle (Chancellor) on Aug 28, 2003 at 12:59 UTC
Yep, that's easy. `sub make_insert_func { my @PLACEHOLDERS = ('?') x @_; local $" = ','; my $SQL = "INSERT (@_) INTO foo VALUES (@PLACEHOLDERS)"; return sub { my $dbh = shift \|\| return; my $sth = $dbh->prepare($SQL); $sth->execute(@_); $sth->finish; } }` [download] Makeshifts last the longest.	[reply] [d/l]
Re: Fast Building of SQL Statements by menolly (Hermit) on Aug 27, 2003 at 20:36 UTC
When you first mention using `x`, you have `join(',', '?' x @FIELDS)`. In your posted benchmark code, the `join ','` is omitted.	[reply] [d/l] [select]
Re: Re: Fast Building of SQL Statements by hardburn (Abbot) on Aug 27, 2003 at 20:45 UTC
Doesn't matter. The `join` in the code near the start is an example of placing the generated data into an SQL statement. `join` would run at the exact same rate (ignoring system noise) for the benchmarks, so there's no reason to have it in. ---- I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident. -- Schemer Note: All code is untested, unless otherwise stated	[reply] [d/l] [select]