in reply to Form, Input, Taint related

CGI.pm is the way to go. It's the only way to go. There are some smaller modules that do much of what CGI.pm does, but CGI.pm is just fine in its full form.

There's a lot to the discussion, but in a nutshell, tainting is not really necessary if you are just collecting input and writing it to a database. One assumes that your normal validation of the input (e.g., did they enter a number when they should have entered letters?) will protect what goes into the database. In fact, setting the -T switch will have no effect on this kind of input that stays 'inside the system.'

However, taint checking is a must if you are using the input to do something that involves file or directory manipulation, shells, and the like. Those are the only times I have found that the -T switch will scream if you neglect to untaint. Here's a sample untaint:

my ($untainted_url) = $url =~ m{^(http://www\.[\w.-]+)$};

Make sure that you really do test the value for what it should be and don't cheat with something ineffectual like:

my ($untainted_url) = $url =~ /([\w.-]+)/;
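
As a rough sketch of the difference, the untaint step can be wrapped in a small helper that rejects anything that doesn't match the whole pattern. The name untaint_url and the exact pattern are my own assumptions for illustration, not anything CGI.pm or Perl provides; tighten the regex to whatever your URLs should really look like.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): returns a laundered copy of $url
# if the WHOLE value looks like http://www.<host>, or undef otherwise.
sub untaint_url {
    my ($url) = @_;
    return unless defined $url;

    # \A and \z anchor the match to the entire string, so trailing junk
    # (or a trailing newline) makes the match fail instead of being ignored.
    my ($clean) = $url =~ m{\A(http://www\.[\w.-]+)\z};
    return $clean;    # undef when the match failed
}

for my $candidate ('http://www.example.com', 'http://www.example.com/;rm -rf /') {
    my $clean = untaint_url($candidate);
    print defined $clean ? "ok: $clean\n" : "rejected: $candidate\n";
}
```

Under -T, only the captured value is considered clean; the original $url stays tainted, so always carry on with the return value.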

Update: Forgot about Ovid's great little node "Use CGI or die;" comparing CGI to other methods.


—Brad
"The important work of moving the world forward does not wait to be done by perfect men." George Eliot

Re^2: Form, Input, Taint related
by jhourcle (Prior) on Apr 11, 2005 at 12:31 UTC

    CGI.pm is probably the best way to get at the form data, especially if you're new to Perl and/or CGI programming.

    That being said, I don't like CGI.pm because it does too much work -- it handles both receiving input and creating HTML, the latter through a whole bunch of functions that pollute your namespace. I don't like that DISABLE_UPLOADS and POST_MAX aren't set by default, and that the note that they're a security risk is buried in the documentation. (See Ovid's CGI::Safe.)
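
    For anyone who hasn't dug those two settings out of the documentation, here is a minimal sketch of setting them yourself. The 100 KB ceiling and the 'name' field are illustrative assumptions; choose values that fit your own forms.

```perl
use strict;
use warnings;
use CGI;

# Set the limits before creating the CGI object. Values are illustrative.
$CGI::POST_MAX        = 100 * 1024;    # refuse POST bodies larger than 100 KB
$CGI::DISABLE_UPLOADS = 1;             # refuse file uploads entirely

my $q = CGI->new;

# If the POST was over the ceiling, the parameter list comes back empty
# and cgi_error() reports why.
if ( my $error = $q->cgi_error ) {
    print $q->header( -status => $error );
    exit 0;
}

my $name = $q->param('name');          # 'name' is a hypothetical form field
```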

    I wish that Lincoln Stein would split up the HTML generating bits and the CGI handling into separate modules, so that I could just load the part that I want to use, without needing to resort to other modules (CGI::Lite, CGI::Base, etc.) or hacking at it myself and keeping it in sync with updates.

      Thanks for the replies!

      Well instead of CGI.pm, as of now I'm using this example I got from a web site.

      sub startup {
          $query = $ENV{'QUERY_STRING'};
          if ($query) {
              @pairs = split(/&/, $query);
          } else {
              read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
              @pairs = split(/&/, $buffer);
          }
          foreach $pair (@pairs) {
              $something_in = 1;
              ($name, $value) = split(/=/, $pair);
              $value =~ tr/+/ /;
              $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
              $name =~ tr/+/ /;
              $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
              if ($INPUT{$name}) {
                  $INPUT{$name} = $INPUT{$name} . "," . $value;
              } else {
                  $INPUT{$name} = $value;
              }
          }
      }

      # Then, when I want to retrieve the value of the input field called "name", I use:
      &startup;
      print "$INPUT{'name'}";

      This is the other alternative I was talking about. Is this as safe as using CGI.pm's input handling?

      Thanks!

        The biggest issue that I see is that you're trusting the CONTENT_LENGTH header without placing any restriction on it, so nothing stops someone from claiming it to be 50GB or some other excessive number.
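
        One way to put a ceiling on it, sketched under my own assumptions: read_body and the 64 KB limit are hypothetical, and the in-memory filehandle just stands in for STDIN so the snippet runs outside a web server.

```perl
use strict;
use warnings;

# Hypothetical limit; pick whatever your forms legitimately need.
my $MAX_BODY = 64 * 1024;

# Returns the request body, or undef if the declared length is missing,
# not a plain non-negative integer, or larger than $max.
sub read_body {
    my ($fh, $declared, $max) = @_;
    return unless defined $declared && $declared =~ /\A(\d+)\z/;
    my $length = $1;
    return if $length > $max;

    my $buffer = '';
    my $got = read($fh, $buffer, $length);
    return unless defined $got;    # read error
    return $buffer;
}

# In a real CGI script this would be:
#   my $body = read_body(\*STDIN, $ENV{CONTENT_LENGTH}, $MAX_BODY);
my $sample = 'name=Brad&lang=perl';
open my $fh, '<', \$sample or die "open: $!";
my $body = read_body($fh, length($sample), $MAX_BODY);
print defined $body ? "read: $body\n" : "refused\n";
```

        When the function returns undef, the script can answer with an error status instead of trying to allocate whatever the client claimed.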

        It's possible that, because the values aren't being tested for taint, they might cause other problems, but I don't know how you're using the data. If you're just printing to a log (that you're viewing with only a text editor), a report, or whatever, you might be just fine with the rest of it. If you use input as the basis for something that results in a filename, system call, database call, e-mail, or anything else that can be abused, you might want to rethink how values are being parsed.

        Although there's a bit of overhead from CGI.pm because of all the HTML generation bits that you won't be using, it can provide more robust input handling and the ability to fix things in one spot, rather than spread across every file that handles CGI input. Make sure you look at the notes in CGI::Safe, though.