kyle has asked for the wisdom of the Perl Monks concerning the following question:

Say I have a variable, $suspicious, and I want to know if at some point someone said $suspicious = $dbh->quote( $clearly_tainted ) or if, sadly, $suspicous = $clearly_tainted.

The DBI documentation says that quote() may legally "return an SQL expression that evaluates to the desired string", so I can't simply test whether $suspicious "starts and ends with quotes and has all quotes inside escaped". Not only that, different databases may need things quoted differently anyway. I might also expect a special case for "NULL", which is a legal value without the quotes.

I suspect I may have to look into the DBD code for the particular database I'm using and write a way to detect whether a particular string could have come from it. This will break, however, when I'm connected to a different database.

(The XY Problem here, if you're curious, is that I may inherit some code that does not consistently quote() and use placeholders. This may lead me to do a lot of work tracing a variable from its use in some SQL back to its origin. I'm wondering if I can instead just put a sanity check in the place where it gets used. I don't actually have this problem now, but I'm trying to think ahead.)

I offer advanced thanks for your advanced wisdom.

Replies are listed 'Best First'.
Re: Can I tell if DBI's quote() was used?
by Joost (Canon) on Apr 06, 2008 at 15:49 UTC

      I love this idea! The object it returns could (hopefully) track its own usage and dump that information for debugging. That is, use caller to provide a log:

      1. File and line where the object was created.
      2. File and line for each place it was stringified.

      It could dump this on destruction or continuously during its lifetime. I wonder if it could detect when it was about to be double quoted. This seems useful enough that I wonder if it already exists.

        I wonder if it could detect when it was about to be double quoted. This seems useful enough that I wonder if it already exists.

        If you override overload stringification, you could log any call to "". I expect that would be more or less enough information. Keep in mind that you actually want to detect the strings that haven't been quote()ed, though :-)

Re: Can I tell if DBI's quote() was used?
by ikegami (Patriarch) on Apr 06, 2008 at 09:16 UTC

    No, it's not possible. Specifically,

    • It's possible to tell if some strings need quoting.
    • It's not possible to tell if any string needs quoting.
    • It's not possible to tell if a string has already been quoted.

    For example, foo bar clearly needs quoting, but "foo bar" could be foo bar already quoted or "foo bar" that hasn't been quoted.

    quote should only be used where a string literal is expected, so NULL is indeed a special case. Then again, it should *already* be a special case cause string literals are usually used with "=" and NULL is usually used with "IS". See the "NULL Values" subsection of the "Placeholders and Bind Values" section of the DBI docs.

    You should be more worried about numbers than about NULL.

      "foo bar" could be foo bar already quoted or "foo bar" that hasn't been quoted.

      That's a good point I hadn't thought of. I'm not sure how much I'm worried about that, though. If SQL will consider it a literal (and not a syntax error or someone's SQL injection attack), that may be close enough. Then again, maybe not.

      You should be more worried about numbers than about NULL.

      Why's that?

        Because NULL must already be handled specially. (Then again, giving the problem you have, that might be assuming too much.) OTOH, one needs to know whether the SQL statement is expecting a string literal or a numerical literal to know whether 5 should be 5 or "5". Or maybe 5 will always do, but you still need to special case that.
Re: Can I tell if DBI's quote() was used?
by Cabrion (Friar) on Apr 06, 2008 at 10:45 UTC

    I agree with the others, you cannot really do what you are asking. I can't give you a silver bullet, but I can offer a process that might help.

    I had a similar situation with a large amount of code that inconsistently referenced various schemas. Sometimes it was hard coded in the connect string, sometimes it was in a config file, sometimes it was hard coded in the SQL statements themselves, etc...

    My approach was to build a bunch of regular expressions to filter the source code and pinpoint likely problem areas. In my case, I left all the legacy code in place so as not to break anything, then used the regex to change the schema between development and production database instances. Anything I found that I couldn't reliably manipulate with a regex got immediate priority for refactoring. It was a bit ugly, but it worked really, really well.

    I literally started with regex's something like this:

    /dbi\W+connect\(/i # grab every DBI connection call /(select)|(update)|(delete)/i # grab every thing that looks like SQL

    Then I eyeballed the output for problem areas and patterns, refined the regex's to ignore some things & detect others, and repeated. Even though it was a large code base it didn't take too long to zero in on the true problem areas.

    You can scan a lot of code very quickly this way, and that's basically what you will have to do. Using perl to scan perl source is really effective, and I'm sure there are modules out there to help.

Re: Can I tell if DBI's quote() was used?
by stiller (Friar) on Apr 06, 2008 at 08:53 UTC
    Hm, "always use placeholder(s)" would make for a really useful policy in perlcritic if it get's written...
Re: Can I tell if DBI's quote() was used?
by bart (Canon) on Apr 08, 2008 at 07:46 UTC
    Could you... if only for debugging, override quote to return an object instead of just a string, which stringifies to the originally quoted string, and for which you test in either use in code that uses placeholders, depends on quoted strings, or gets quoted again?

    Perhaps you could use pre and post hooks to modify DBI's quote's behaviour: a pre hook to check the parameters, a post hook to wrap the string into an object. You ought to be able to attach the hooks to the code ref returned by $dbh->can('quote').

    This way you will not be able to detect (hopefully) quoted strings that get interpolated into SQL, because you'll immediately loose the object and get back a flat string. But it may be a starting place for a better test suite.