Lately, I've spotted a few examples where users haven't been checking input before displaying it in a web page (katgirl!), allowing malicious users (i.e. me) to enter JavaScript in the inputs that then gets displayed on the web page.

Most people are aware of this, but I haven't seen a formal way of taking precautions that's globally accepted.

My thought was to create a module (CGI::Taint) that, when used, overloads print to add taint checks to the items sent to it. Currently, taint mode doesn't consider print at all, so tainted data can be printed unchecked.

I'm not 100% sure this is the best approach, but would something like the following work?

    use CGI;
    use CGI::Taint;

    my $q = CGI->new();

    my $tainted_var = $q->param('form_input');

    # first case - dies to browser with
    # "attempt to print tainted var at line 10"
    print $q->header . "Tainted: $tainted_var";
    exit(0);

    # second case - no error
    # (character class, so a run of word chars and whitespace is captured)
    my $untainted_var = '';
    $tainted_var =~ /([\w\s]+)/ and $untainted_var = $1;
    print $q->header . "Untainted: $untainted_var";
    exit(0);

So, my questions are (I guess):

  1. is this the right approach?
  2. is it feasible, given how complicated CGI.pm is?

If it could work, I don't think it would be that much work, or are there other functions that would need overloading too? printf?
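One way this could be prototyped without touching CGI.pm or print itself is to tie the output handle, which routes both print and printf through a single check. The following is only a minimal sketch under assumptions: the package name TaintGuard and the injectable `check` callback are made up for illustration and are not an existing CPAN module. The default check is `Scalar::Util::tainted`, which only ever returns true when perl is started with -T, so without -T everything passes through.

```perl
package TaintGuard;    # hypothetical name, for illustration only
use strict;
use warnings;
use Scalar::Util qw(tainted);

# tie *HANDLE, 'TaintGuard', $real_handle [, $check_coderef]
sub TIEHANDLE {
    my ($class, $out, $check) = @_;
    # $check is an assumption added so the guard can be exercised
    # without -T; by default the real taint test is used.
    return bless { out => $out, check => $check || \&tainted }, $class;
}

# Die if any argument fails the check.
sub _check {
    my $self = shift;
    for my $item (@_) {
        next unless defined $item;
        die "attempt to print tainted data\n" if $self->{check}->($item);
    }
    return;
}

# Called for every print on the tied handle.
sub PRINT {
    my $self = shift;
    $self->_check(@_);
    return print { $self->{out} } @_;
}

# Called for every printf on the tied handle.
sub PRINTF {
    my ($self, $fmt, @args) = @_;
    $self->_check($fmt, @args);
    return printf { $self->{out} } $fmt, @args;
}

package main;

# Keep a duplicate of the real STDOUT, then put the guard in front of it.
open my $real_stdout, '>&', \*STDOUT or die "cannot dup STDOUT: $!";
tie *STDOUT, 'TaintGuard', $real_stdout;

# Untainted literal, so this passes with or without -T.
print "Content-Type: text/plain\n\n";
```

Under -T, printing a value that came straight from `$q->param(...)` would die inside `_check`, while a value captured from a regex match would go through.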

Hmmm. Thoughts welcomed.

cLive ;-)

Re: writing a "CGI::Taint" module
by diotalevi (Canon) on Jul 12, 2003 at 21:42 UTC

    That's easy - apply Filter::Handle to STDOUT and check to see that the data isn't tainted. die() if it happens. This is really about preventing tainted data from going to STDOUT so the name isn't great but hey, it works. Why don't you document it and submit it to CPAN?

    package CGI::Taint;
    use Filter::Handle 'subs';
    use Taint;

    BEGIN {
        Filter \*STDOUT, sub {
            # Access $_[0] directly so that tainted() can test
            # the actual variable.
            if ( tainted( $_[0] ) ) {
                die "Tainted output could not be written to STDOUT: $_[0]";
            }
            $_[0]
        }
    }
Re: writing a "CGI::Taint" module
by tilly (Archbishop) on Jul 12, 2003 at 20:21 UTC
    I think that most people find it easier to do their escaping strategy on the way into the database than on output.

    This would suggest that it would be more useful to have a version of DBI that did taint checks on the way into the database than something which did them on the way out.

    Furthermore your suggested module is not at all CGI specific. You seem to be adjusting the print function to refuse to print tainted data. But print is used for all kinds of programs. Since its functionality is not specific to CGI, I would be leery of giving it a name indicating that it was.

      My point being that this is a specific case where data is output to the browser, hence CGI. Of course, there's also the issue of templating systems, so valid point about DBI storage as well.

      Perhaps then, an extension to DBI that wouldn't let you store tainted data would be useful, along with some pre-defined methods to untaint data, e.g. (1) strip all HTML markup, (2) strip all HTML markup except for specified tags (a la HTML::TagFilter), (3) escape all HTML markup.

      use DBI; use DBI::Taint;
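To make options (1) and (3) concrete, here is a rough sketch of what such pre-defined untainting helpers might look like. The names `escape_html` and `strip_html` are hypothetical, and the tag-stripping regex is deliberately naive (it will not cope with every malformed tag or with comments); a real parser, or HTML::TagFilter for option (2), would be the safer choice. The final regex capture is what clears the taint flag under -T.

```perl
use strict;
use warnings;

# (3) escape all HTML markup, then launder the result.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;

    # A regex capture is how Perl untaints data; capturing everything
    # is only reasonable here because the text has just been escaped.
    my ($clean) = $text =~ /\A(.*)\z/s;
    return $clean;
}

# (1) strip all HTML markup (naive: removes anything shaped like <...>),
# then escape whatever special characters are left over.
sub strip_html {
    my ($text) = @_;
    $text =~ s/<[^>]*>//g;
    return escape_html($text);
}

print escape_html('<script>alert(1)</script>'), "\n";
# prints: &lt;script&gt;alert(1)&lt;/script&gt;
```

A DBI::Taint-style wrapper could then insist that only values returned by helpers like these reach the database.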

      Hmm, still only a germ of an idea...

      cLive ;-)

        Revisit your DBI documentation. While data tainting in DBI isn't officially finalized (per the documentation), it certainly exists right now. Also, see my other post for info on preventing tainted data from going out to STDOUT.

        DBI->connect('dbi:...', ... { Taint => 1, TaintOut => 1, TaintIn => 1 })