Ovid has asked for the wisdom of the Perl Monks concerning the following question:

Well, I better get off my duff and get this done. I have a module, CGI::Safe, that currently makes the CGI environment a bit safer by deleting certain environment variables, disabling uploads, setting max post size, etc. Current syntax is like this:

use CGI::Safe qw/ taint /;
my $q = CGI::Safe->new;

This is a subclass of CGI.pm, so you can use either the object-oriented or the function-oriented interface. It's pretty much the same thing.

Having 'taint' in the import list is currently a no-op. However, in future versions, this is intended to allow most CGI scripts to run unchanged. People can specify 'taint' and allow tainted variables to be returned:

use CGI::Safe qw/ :standard taint /;
my $var = param( 'var' ) || '';
( $var ) = ( $var =~ /^([\s\w\d]+)$/ );

Without 'taint' being specified, CGI::Safe is intended to never directly return tainted data. A default value, such as undef or an empty string, will be returned in place of anything tainted. However, I am not sure how to specify the untainting regexes. Perhaps I could use Untaint or CGI::Untaint for this functionality. What do you think would be a clean, easy-to-use syntax for this?

use CGI::Safe;
my $q = CGI::Safe->new;

# set default tainted return to empty string
$q->default_tainted( '' );

# assign the regex
$q->untaint( foo => qr/^([\w\s\d]+)$/ );

# will return an empty string if it doesn't untaint
my $foo = $q->param( 'foo' );
if ( ! $foo ) {
    error_routine( $q->tainted_param( 'foo' ) );
}

Alternatively (since no implementation of CGI::param seems to take a hashref):

my $foo = param( { foo => qr/^([\w\s\d]+)$/ } );
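To make the object-style proposal concrete, here is a minimal sketch of the semantics it implies. Sketch::Untaint is a hypothetical stand-in written only for illustration; it is not CGI::Safe, it takes its "submitted" values from the constructor instead of a real request, and it assumes each regex carries exactly one capture group.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the proposed interface; not CGI::Safe itself.
package Sketch::Untaint;

sub new {
    my ( $class, %raw ) = @_;
    # raw: parameter name => submitted value; rules: name => compiled regex
    return bless { raw => {%raw}, rules => {}, default => undef }, $class;
}

# Value handed back when a parameter has no rule or fails its rule.
sub default_tainted {
    my ( $self, $value ) = @_;
    $self->{default} = $value;
    return $self;
}

# Register one or more name => qr/(capture)/ pairs.
sub untaint {
    my ( $self, %rules ) = @_;
    @{ $self->{rules} }{ keys %rules } = values %rules;
    return $self;
}

# Return the first capture if the rule matches, else the default.
sub param {
    my ( $self, $name ) = @_;
    my $rule  = $self->{rules}{$name};
    my $value = $self->{raw}{$name};
    return $self->{default} unless defined $rule && defined $value;
    my ($clean) = $value =~ $rule;
    return defined $clean ? $clean : $self->{default};
}

# Unfiltered value, e.g. for logging or an error page.
sub tainted_param {
    my ( $self, $name ) = @_;
    return $self->{raw}{$name};
}

package main;

my $q = Sketch::Untaint->new( foo => 'hello world', bar => 'rm -rf /;' );
$q->default_tainted('');
$q->untaint( foo => qr/^([\w\s]+)$/, bar => qr/^([\w\s]+)$/ );

print "foo=[", $q->param('foo'), "] bar=[", $q->param('bar'), "]\n";
```

One pitfall this makes visible: with an empty-string default, a test like `if ( ! $foo )` also treats a legitimate value of "0" as a failure, so comparing against the default (or checking length) may be the safer idiom.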

Of course, I'd also want to provide this for cookies, but this is just a start.

What do you think is clean? What would you like to see here? Any thoughts on implementation pitfalls that I should be aware of (other than users supplying bad regexes)? I would also prefer more concise method names than "default_tainted" and "tainted_param", but I want them to stay self-documenting.

Random thoughts:

Cheers,
Ovid


Re: CGI::Safe untaint syntax
by footpad (Abbot) on Jan 11, 2002 at 11:31 UTC

    Warning: Sleep deprivation alert.

    Provide a global untainting regex for simple forms.

    How would that be different from the standard one that everyone uses in their examples, but then tells us, "Don't use this; it's insecure. Use something more wise in the ways of your data and what you should be allowing."?

    *snik* (A rant button is ominously triggered.)

    To my mind, it might be better to provide samples that work for *very* specific circumstances, e.g., one that matches each of the following:

    • a North American telephone number (optional area code and extension)
    • a date value following U.K. idioms, as well as one using a locale derived from the submitting browser's IP.
    • a non-zero currency value with required decimal places and an optional currency symbol.
    • a general comment that allows standard, conversational punctuation.
    • a comment that might (or might not) contain a <code> block.
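As one illustration of the first item in the list, here is a hedged sketch of a pattern for a North American telephone number with an optional area code and extension. The name clean_phone and the accepted separators are assumptions made for this example; the pattern checks only the shape of the input and does not enforce real numbering-plan rules (for instance, which digits may begin an area code).

```perl
use strict;
use warnings;

# Illustrative pattern only: digits are captured, separators are discarded.
my $PHONE = qr/
    ^\s*
    (?: \(? (\d{3}) \)? [-.\s]? )?              # optional area code
    (\d{3}) [-.\s]? (\d{4})                     # exchange and line number
    (?: \s* (?: x | ext\.? ) \s* (\d{1,5}) )?   # optional extension
    \s*\z
/xi;

# Return a normalized number rebuilt from the captured digits,
# or undef when the input does not look like a phone number.
sub clean_phone {
    my ($input) = @_;
    return undef unless defined $input;
    my ( $area, $exchange, $line, $ext ) = $input =~ $PHONE
        or return undef;
    my $clean = join '-', grep { defined $_ } $area, $exchange, $line;
    $clean .= " x$ext" if defined $ext;
    return $clean;
}

for my $input ( '(555) 867-5309 ext. 12', '867-5309', 'call me maybe' ) {
    my $clean = clean_phone($input);
    printf "%-24s => %s\n", $input, defined $clean ? $clean : '(rejected)';
}
```

The questions it forces are the interesting part: which separators to tolerate, whether to keep the user's formatting or rebuild the value from the captures (done here), and how long an extension may be.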

    Personally, one wonders why so many examples use .*? and then say, "This is a really lousy example. Don't use it." Where are the examples we're supposed to learn from? What works? What are the questions to ask, answer, and whatnot?

    Don't get me wrong; I do understand why everyone says to "allow only what's permissible." However, I believe there is room for a discussion on learning how to determine what you really need. Now, whether this is better provided with CGI::Safe, Regexp::Common, or your Security Course is up to you (and/or japhy). However, I believe there is a crying need for *someone* to say, "OK. You know you need to make a decision. Let's walk through the process in a specific example and see the types of questions we encounter." It's one thing to tell someone that .*? isn't a good untainting match; it's another to discuss the issues that teach the process of choosing a better solution.

    Do I think many people will read and learn from it? Maybe, maybe not. But, the few that do will learn and that will make it worthwhile.

    </rant>

    Sorry...that's one of the reasons why I bugged you about an update in the first place. I wanted to see examples that people I respect put their servers at risk using. Show me something that works in one place and I'll evaluate its effectiveness in another. If you tell me it works in a specific case, I can accept that. Just show me the issues to consider and how you dealt with them. (Links acceptable, too)

    --f

      I think a fine default is /(\w[-.\w]*)/. I don't see how it would lead to security problems. It might lead to a script that doesn't work because, for example, you need to accept negative numbers, or something with spaces in it, etc.
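A small sketch of how that default behaves on a few inputs may help when weighing it. Because the pattern as written is unanchored, it returns the first run of acceptable characters rather than rejecting input that contains anything else. The anchored variant below is an assumption added for comparison, not something proposed in the post, and untaint_with is a hypothetical helper.

```perl
use strict;
use warnings;

my $loose  = qr/(\w[-.\w]*)/;       # the pattern as written above
my $strict = qr/^(\w[-.\w]*)\z/;    # assumption: anchored variant for comparison

# Return the first capture, or undef when the pattern does not match.
sub untaint_with {
    my ( $re, $value ) = @_;
    my ($match) = $value =~ $re;
    return $match;
}

for my $input ( 'report-2002.txt', '-5', 'two words', '../etc/passwd' ) {
    my $l = untaint_with( $loose,  $input );
    my $s = untaint_with( $strict, $input );
    printf "%-16s loose=%-16s strict=%s\n", $input,
        defined $l ? $l : '(undef)',
        defined $s ? $s : '(undef)';
}
```

With the unanchored form, '-5' comes back as '5' and '../etc/passwd' comes back as 'etc'. The result contains only safe characters either way; the design question is whether silently trimming or outright rejecting is the behavior you want by default.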

      I'd rather people default to this and then get a chance to reconsider their design when their parameters don't fit that pattern (or whether they like their design and just need a looser untaint pattern). Certainly it should be easy to set a global default. And it should be easy to set no global default so you get told if you forget to pick an untaint pattern for one of your parameters.

      Yes, I think there should be a selection of untaint patterns for common data types.

      I'd accept (compiled) regular expressions, the name of some predefined untaint pattern/routine, an array ref or hash ref of exactly what values are allowed, or a code ref for the really complex cases.
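The list of accepted validator types above could be dispatched on ref(), roughly as follows. This is a sketch under assumptions: clean_value and the %NAMED table (with its 'word' and 'integer' entries) are invented for illustration, regexes are expected to carry one capture group, and a code ref is expected to return the cleaned value or undef.

```perl
use strict;
use warnings;

# Hypothetical table of named patterns; the names are invented for illustration.
my %NAMED = (
    word    => qr/^(\w[-.\w]*)\z/,
    integer => qr/^(-?\d+)\z/,
);

# Return the cleaned value, or undef if $value fails $check.
# $check may be a compiled regex, a name from %NAMED, an array ref or
# hash ref of allowed values, or a code ref returning the cleaned value.
sub clean_value {
    my ( $value, $check ) = @_;
    return undef unless defined $value && defined $check;

    my $type = ref $check;
    if ( $type eq '' ) {    # plain string: name of a predefined pattern
        $check = $NAMED{$check} or return undef;
        $type  = ref $check;
    }
    if ( $type eq 'Regexp' ) {
        my ($match) = $value =~ $check;
        return $match;
    }
    if ( $type eq 'ARRAY' ) {
        # Hand back the program's own copy, not the submitted string.
        my ($allowed) = grep { $_ eq $value } @$check;
        return $allowed;
    }
    if ( $type eq 'HASH' ) {
        my ($allowed) = grep { $_ eq $value } keys %$check;
        return $allowed;
    }
    if ( $type eq 'CODE' ) {
        return scalar $check->($value);
    }
    return undef;    # unsupported validator type
}

print clean_value( 'blue', [qw( red green blue )] ), "\n";
print clean_value( '-17',  'integer' ), "\n";
```

For the array and hash cases the sketch returns the matching entry from the allow-list rather than the submitted value, so under taint mode the result is a string the program itself supplied.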

      I also think there should be an upload() method that requires you to specify the full path of the directory you want to save the file to, a maximum file size, a maximum total space to be used by the directory, and give you the option of specifying an alternate untaint pattern for the file name but defaults to something like my first example. I'd probably also have it default to binmode but let you request "text" file uploads. Eventually you might want to support allowing the user to specify a subdirectory.

              - tye (but my friends call me "Tye")
(crazyinsomniac) Re: CGI::Safe untaint syntax
by crazyinsomniac (Prior) on Jan 11, 2002 at 11:23 UTC