in reply to Removing malicious HTML entities (now with more questions!)

Update2: Taint-mode has been brought to my attention. It seems like an excellent way to secure user input. Should it be used in conjunction with the other methods suggested in this node (and comments), or is it good enough by itself?

Taint mode is simply a means for making sure that you actually do use "the other methods suggested." All it does, really, is cause your script to die if/when it tries to do anything it shouldn't do with untrusted data. If you haven't used it yet, but your script is already written in a fully secure way, adding "-T" on the shebang line will make no difference.

If you have forgotten to cover any vulnerabilities, or if you later modify the script and accidentally introduce a vulnerability, having "-T" on the shebang line will make a difference: the script will die with an error message about the nature of the problem.

The one big problem with "-T" is that it can be remarkably easy to disable its usefulness as a safety device, simply by taking inappropriate steps to "untaint" your untrusted data.

Consider the following script, which is potentially quite dangerous to run (so don't use it at all if you don't understand what the risks are):

#!/usr/bin/perl -T
use strict;
use warnings;

$ENV{PATH} = "/bin";

while (<>) {
    chomp;
    my $str = '';
    if ( /(.+)/ ) {
        $str = $1;
    }
    system( "echo $str" );
}

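Having taint mode turned on does not stop that script from causing any given amount of damage or mischief. The regex match /(.+)/ satisfies the requirements for untainting data, but because it accepts any non-empty line, it does nothing at all to protect you from the bad things that could happen: shell metacharacters in the input (a line such as "hi; rm -rf ~", for example) pass straight through to the shell that system() invokes.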
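
For contrast, here is a rough sketch of what a more appropriate untainting step might look like. It is not the only way to do it: the sub name untaint_word and the particular whitelist (letters, digits, underscore, dot and hyphen, at most 64 characters, no leading hyphen) are assumptions made up for illustration, and the right pattern depends entirely on what your data is supposed to look like. The sketch also uses the list form of system(), which bypasses the shell. It is meant to be run with "perl -T"; the sample strings are hard-coded, so they stand in for what would really be tainted input.

```perl
use strict;
use warnings;

# Taint mode refuses to run external programs with an untrusted environment.
$ENV{PATH} = "/bin";
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical helper: return the untainted value only if the whole string
# matches a tight whitelist; return undef otherwise.
sub untaint_word {
    my ($input) = @_;
    return $input =~ /\A([A-Za-z0-9_][A-Za-z0-9_.-]{0,63})\z/ ? $1 : undef;
}

# Hard-coded samples standing in for lines of user input.
for my $line ( "hello_world", "hi; rm -rf ~", "-n" ) {
    my $str = untaint_word($line);
    unless ( defined $str ) {
        warn "Rejected input: [$line]\n";
        next;
    }
    # List form of system(): no shell is involved, so the argument is
    # passed to echo as-is and metacharacters have no special meaning.
    system( "/bin/echo", $str ) == 0
        or warn "echo failed: $?\n";
}
```

The point of the design is that the regex describes what the data is allowed to be, instead of merely capturing whatever arrived.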

Re^2: Removing malicious HTML entities (now with more questions!)
by Lawliet (Curate) on Aug 16, 2008 at 19:04 UTC
    "If you later modify the script and accidentally introduce a vulnerability, having "-T" on the shebang line will make a difference"

    That is why I plan on using it ^.^

    I'm so adjective, I verb nouns!

    chomp; # nom nom nom

      You should always plan to use it with CGI scripts

      The trick to untainting data, as far as I am aware, is to ensure your data is correct, i.e. do data validation. Usually this means using (tight) regexps to ensure the user input doesn't go outside expected bounds.

      From what I have read, if you are entering anything into a db then you might want to SQL-escape it too so that people can't hijack your database and delete everything.

      HTML::Entities will help display stuff that might otherwise break your web page - what's left that can break your db?
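
A minimal sketch of the HTML side, assuming the HTML::Entities module from CPAN is installed. The variable names and the sample string are invented for illustration.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical user-supplied text containing markup.
my $comment = '<b>"x" & y</b>';

# With no second argument, encode_entities() escapes <, >, &, the quote
# characters, control characters and high-bit characters.
my $safe = encode_entities($comment);

print "$safe\n";
```

Encoding happens at output time, so the stored data stays as the user typed it and only its display is made safe.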

        "If you are entering anything into a db then you might want to SQL-escape it too so that people can't hijack your database"

        By using placeholders, right?
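
For what it's worth, a placeholder-based insert might look roughly like the sketch below. It assumes DBI and DBD::SQLite are installed and uses an in-memory database so that it is harmless to run; the table name, column names and the add_comment sub are all hypothetical.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; ":memory:" keeps the demo self-contained.
my $dbh = DBI->connect( "dbi:SQLite:dbname=:memory:", "", "",
    { RaiseError => 1, AutoCommit => 1 } );

$dbh->do("CREATE TABLE comments (author TEXT, body TEXT)");

# Hypothetical helper: the ? placeholders keep user data out of the SQL text,
# so quotes and semicolons in the values are stored as plain data.
sub add_comment {
    my ( $dbh, $author, $body ) = @_;
    my $sth = $dbh->prepare(
        "INSERT INTO comments (author, body) VALUES (?, ?)");
    $sth->execute( $author, $body );
    return;
}

# An injection attempt ends up as an ordinary string in the table.
add_comment( $dbh, "mallory", "x'); DROP TABLE comments; --" );

my ($count) = $dbh->selectrow_array("SELECT COUNT(*) FROM comments");
print "rows stored: $count\n";
```

Placeholders take care of quoting for the database, but they are not a substitute for validating that the data itself makes sense.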
