Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I'm looking for any guides on how to escape SQL correctly in a program. What I'd really like is something like placeholders, where I could do

$escaped_sql = escape_with_placeholders( "INSERT INTO foo VALUES ( ?, +?, ? )", $a, $b, $c );

The database I'm using is Postgres, although I'd like to stay as agnostic as possible. I'm sure DBI has some way to do this, but I can't work out how I would go about creating a DBI object without a database to connect to, or how I would intercept the escaped SQL. The data I'll be needing to escape includes random bytes than can span the entire 0-255 range, and it needs to make it into the database without error or corruption.

Edited by planetscape - tweaked formatting for readability

Replies are listed 'Best First'.
Re: Escaping SQL correctly
by rhesa (Vicar) on Mar 08, 2006 at 01:24 UTC
Re: Escaping SQL correctly
by friedo (Prior) on Mar 08, 2006 at 01:26 UTC

      And I'd love to use them, but I don't see how I can get a DBI handle without connecting to a database. My program won't have access to the database when it runs.

      If there's no other way to do it, I suppose I could install a dummy database with no data in it for DBI to connect to, but I'd still need to get access to the SQL generated as a string instead of sending it to the database.

Re: Escaping SQL correctly
by jZed (Prior) on Mar 08, 2006 at 01:35 UTC

    > I'm sure DBI has some way to do this

    Yes, they're called placeholders. See the DBI docs for prepare() and execute(). Or withouth placeholders see the quote() method.

    > but I can't work out how I would go about creating a DBI object without a database to connect to

    I'm not really sure what you mean. Don't you want to connect to a database and send the escaped SQL to it? If you just want to run DBI and DBD::Pg without connecting to a specific datbase, you can connect to one of the system databases (see the Pg docs for their names).

    > or how I would intercept the escaped SQL.

    Again, I don't know what you mean by intercept, but my $escaped_value = $dbh->quote($unescaped_value) should do the escaping.

      Don't you want to connect to a database and send the escaped SQL to it?

      No.

      I need to generate correctly-formed SQL statements to use in a report, which will then be reviewed by human eyes before entered in a database.

      That quote method looks promising, but I'd still need a handle. Is there any way to construct one without actually creating a database connection?

        Well, as I mentioned you can connect to one of the Pg system tables if you have access to Pg at all. If you don't, you may actually be able to do something like this (note no DBI or connection):
        use DBD::Pg; my $foo = q{foo'bar'baz}; print DBD::Pg::db->quote($foo); # prints 'foo''bar''bas'
Re: Escaping SQL correctly
by graff (Chancellor) on Mar 08, 2006 at 01:44 UTC
    If your script is not really connecting to a database, why are you creating sql? Will you be saving the sql statements as text to some output file, for execution later?

    I really don't understand why you would want to use DBI "without a database to connect to". Please explain that more clearly.

      It's a security requirement for the system. No SQL can be run against the database unless it's reviewed by a human first. This is fine, because we need to run less than one query a day, but the queries are long and complicated and we could save a significant amount of time (and, hence, money) by using a script to generate them, then just have a human sanity check them.
        Hmm. A human is going to review an sql statement that "...includes random bytes than can span the entire 0-255 range." That sounds like a fascinating challenge. But the more I think about, the more it sounds like the human won't really be able to look at such an sql statement the same way that a DBI database handle would.

        If the goal is to provide for sensible manual validation of sql transactions before they are executed, I would look for a method that involves printing the sql statement in a manner similar to what you would prepare via DBI: print the sql syntax legibly (with line feeds and other whitespace as appropriate), and suitable placeholders for data values -- but maybe something more detailed than just "?" -- then print the list of data values (maybe in escaped form, if that's relevant, but the formatting would depend on whether the human should be validating the data values as well as the sql syntax). For example:

        update big_table set easy_column=VAR1, messy_column=VAR2 where index_column=VAR3 VAR1='easy_data_value' VAR2='%01f%F8%F6-b$E5r%04' VAR3='123'
        I'm assuming that after the human has approved the proposed statement (and data), the method for actually executing the statement will be programmatic, and based directly on the data that the human reviewed and approved. The sort of review format suggested above could satisfy that sort approach.
        Once the basic template of your SQL-query is agreed upon and approved by whatever security-chief needs to sign-off on it, all you have to check is the data which is being entered. The usual placeholders and their implied quoting rules should take care of avoiding anything bad happening to your database (such as random SQL-injection attacks). The problem is IMHO probably more with the data than its quoting.

        There is only so much paranoia you can live with: if you cannot trust the placeholder and quoting rules of DBI/DBD (a tried and trusted set of modules), you can as well ask to check and approve the raw stream of bytes going through your sockets, pipes or TCP/IP stack to the database. Who knows wether or not someone has recompiled your modules or Perl to inject something into these streams?

        CountZero

        "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Escaping SQL correctly
by Your Mother (Archbishop) on Mar 08, 2006 at 08:21 UTC

    Now I just need the answer to "Escaping SQL entirely."

Re: Escaping SQL correctly
by pajout (Curate) on Mar 08, 2006 at 12:31 UTC
    Reading previous comments, I have following ideas:
    1/ Dump structure of your pg database and create another instance with equal structure, but without data.
    pg_dump -s dbname > dbname.dump createdb dbname_fake psql dbname_fake \i dbname.dump
    2/ Test the queries against this mockup database instance. It gives you ability to check the semantically wrong queries - relation xyz does not exist, for instance, not only syntactically erroneous queries.
    3/ I think that the tested queries would have the structure similar to
    <sql>select * from xyz where col1 = ?;</sql> <params> <param>123.5</param> </params>
    ...and you will not have any problem with escaping, I hope :)

    update: The commands of point 1/ ignore potential changes of database instance ownership and/or access rights, it is up to you to set these ...