An in-between step would be to allow completely safe characters, disallow or escape completely unsafe characters, and simply delete everything else from the input. That's pretty close to the right thing to do for a lot of weird input characters.
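Here's a rough sketch of that three-way split. The function name filter_input and the particular character lists are my own assumptions for illustration; what counts as safe or unsafe depends on your application and database.

```perl
use strict;
use warnings;

# Hypothetical policy, for illustration only:
#   safe   : letters, digits, space, underscore, dot, comma, hyphen (kept)
#   unsafe : single quote, double quote, semicolon, backslash (input refused)
#   other  : anything else is silently deleted
sub filter_input {
    my ($input) = @_;

    # Completely unsafe characters: disallow the whole input.
    return if $input =~ /['";\\]/;

    # Everything not on the safe list is deleted.
    (my $clean = $input) =~ s/[^A-Za-z0-9 _.,-]//g;
    return $clean;
}

for my $try ("Scott", "Tom<b>!", "Tom'; DELETE FROM table") {
    my $clean = filter_input($try);
    print defined $clean ? "accepted: $clean\n" : "rejected: $try\n";
}
```

This version disallows rather than escapes the unsafe characters. Escaping is best left to the database driver.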

You said in your post you don't understand what you're protecting against. Here's a little bit of the flavor of SQL injection attacks.

At the most fundamental level, you're trying to prevent a statement like this:

my $sql = "SELECT * FROM table WHERE NAME=$name";
from becoming nasty if the user enters something like Scott; DELETE FROM table for their name.

So, you change that to:

my $sql = "SELECT * FROM table WHERE NAME='$name'";
That works for our simple case, but now the user can enter their name as Tom'; DELETE FROM table; SELECT * FROM table WHERE NAME='Bob, which will result in the SQL statement:
SELECT * FROM table WHERE NAME='Tom'; DELETE FROM table; SELECT * FROM table WHERE NAME='Bob'

So now you have to escape quotes, which you can do with the $dbh->quote method, or by using placeholders as others have described.

Other characters that are dangerous to your database will depend on the database, but unless your DBD driver is really crappy, $dbh->quote and placeholders should both be safe.
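A minimal sketch of both approaches, run against the hostile name from above. It assumes DBD::SQLite is installed and uses an in-memory database so it stands alone. The table name people is hypothetical (TABLE is a reserved word in SQL, so it makes a poor real table name).

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available. Any other DBD driver should behave
# the same way for quote() and placeholders.
my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
                       { RaiseError => 1, PrintError => 0 });

$dbh->do('CREATE TABLE people (name TEXT)');
$dbh->do('INSERT INTO people (name) VALUES (?)', undef, $_) for qw(Tom Bob);

my $name = "Tom'; DELETE FROM people; SELECT * FROM people WHERE name='Bob";

# Option 1: quote() returns the value as a properly escaped SQL literal,
# including the surrounding quotes, so do not add your own.
my $quoted = $dbh->quote($name);
my $rows_q = $dbh->selectall_arrayref(
    "SELECT name FROM people WHERE name = $quoted");

# Option 2: a placeholder keeps the value out of the SQL text entirely.
my $sth = $dbh->prepare('SELECT name FROM people WHERE name = ?');
$sth->execute($name);
my $rows_p = $sth->fetchall_arrayref;

# Neither query should match anything, and the table should be intact.
my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM people');
print "quote() matches: ", scalar(@$rows_q),
      ", placeholder matches: ", scalar(@$rows_p),
      ", rows left: $count\n";
```

Given the choice, placeholders are usually less error-prone, since there is no step where you can forget to call quote.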

The remaining dangers, then, depend on what you do with the data. If you're displaying it on a Web page, you want to make sure it doesn't contain HTML tags, particularly JavaScript code. If you're using it to send an email, you want to make sure it doesn't have any characters special to the mailing program (for example sending a ~ to /bin/mail, the source of the security bug in setuidperl IIRC).
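For the Web page case, a minimal sketch of escaping on output could look like this. The function name escape_html is my own; in real code a CPAN module such as HTML::Entities does the same job more thoroughly.

```perl
use strict;
use warnings;

# The five characters that matter in HTML text and attribute values.
my %entity = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Replace each special character with its entity so the browser shows
# the text literally instead of interpreting it as markup.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

print escape_html(q{<script>alert("hi")</script>}), "\n";
# prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```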

One school of thought says all data in the DB should be trustworthy, so you should make sure nothing dangerous to any application gets in. Another school of thought says put whatever you want in the DB, and the application using it is responsible for untainting it on the way out; in that case you need to treat data from the DB as tainted. The most paranoid school of thought says you should do both: stop characters that are likely to be dangerous from getting into the DB, and have applications using the data check that it really is safe. That last one is what I usually try to do.


In reply to Re: Common untainting methods? by sgifford
in thread Common untainting methods? by Wally Hartshorn
