Ny_Worker has asked for the wisdom of the Perl Monks concerning the following question:

howdy
I'm sort of new to using forms and validating input. I heard that you are supposed to use taint mode when dealing with input from a form filled out on the client end.

I need some basic questions answered and maybe some recommendations :)

Should I use CGI.pm to retrieve values from forms? Is using CGI.pm always the best way to go? Are there any alternatives that work just as well and are as safe as CGI.pm?

For taint mode, I'm still looking for some good tutorials and documentation that clearly explain how I can start using it. I know that you have to use the -T switch, but just how do you taint a value? What's the difference between a tainted and an untainted value?

I just started playing around with Perl 3 months ago and am at the point where I want to use forms!! I understand the concept in a way but not the mechanics behind it.

Thanks!!!

Re: Form, Input, Taint related
by bradcathey (Prior) on Apr 11, 2005 at 02:57 UTC

    CGI is the way to go. It's the only way to go. There are some smaller modules that do much of what CGI.pm does, but CGI is just fine in its full form.

    There's a lot to the discussion, but in a nutshell tainting is not really necessary if you are just collecting input and writing it to a database. One assumes that your normal validation of the input, e.g., checking whether they entered a number when they should have entered a letter, will protect what goes into the database. In fact, setting the -T switch will have no effect on this kind of input that stays 'inside the system.'

    However, untainting is a must if you use the input for anything that involves file or directory manipulation, shells and the like. Those are the only cases where I have found that the -T switch will scream if you neglect to untaint. Here's a sample untaint:

    # $url holds the tainted form value; only the captured text is untainted
    my ($untainted_url) = $url =~ m{^(http://www\.[\w.-]+)$};

    Make sure that you really do test the value for what it should be and don't cheat with something ineffectual like:

    # too permissive: accepts almost any run of word characters, dots and hyphens
    my ($untainted_url) = $url =~ /([\w.-]+)/;
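A slightly fuller sketch of the same idea, assuming the script is started with -T on the command line (for example `perl -T script.pl`). The helper name `untaint_url` and the sample values are made up for illustration; the pattern is the one from the sample untaint above, anchored so that the whole value has to match rather than just some part of it.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a validated (and, under -T, untainted) copy
# of the URL, or undef if the value does not look like what we expect.
# Under -T, only text captured by a successful regex match is clean.
sub untaint_url {
    my ($value) = @_;
    return unless defined $value;
    # \A and \z anchor the match so nothing may precede or follow the URL.
    if ($value =~ m{\A(http://www\.[\w.-]+)\z}) {
        return $1;
    }
    return;
}

for my $candidate ('http://www.example.com', 'http://www.example.com; rm -rf /') {
    my $clean = untaint_url($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Without the anchors, the second value would still match on its leading part, and the script would quietly carry on with a truncated string instead of rejecting the input.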

    Update: Forgot about Ovid's great little node "Use CGI or die;" comparing CGI to other methods.


    —Brad
    "The important work of moving the world forward does not wait to be done by perfect men." George Eliot

      CGI.pm is probably the best way to get at the form data, especially if you're new to perl and/or CGI programming.

      That being said, I don't like CGI.pm because it does too much work -- it both handles receiving input, and creating HTML through a whole bunch of functions that pollute your namespace. I don't like that DISABLE_UPLOADS and POST_MAX aren't set by default, and the note that they're a security risk is buried in the documentation. (see Ovid's CGI::Safe)

      I wish that Lincoln Stein would split up the HTML generating bits and the CGI handling into separate modules, so that I could just load the part that I want to use, without needing to resort to other modules (CGI::Lite, CGI::Base, etc.) or hacking at it myself and keeping it in sync with updates.
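For reference, the DISABLE_UPLOADS and POST_MAX settings mentioned above are package variables in CGI.pm. A minimal sketch of setting them, assuming CGI.pm is installed; the 100 KB limit is an arbitrary value picked for illustration, not a recommendation.

```perl
use strict;
use warnings;
use CGI;

# Both variables must be set before CGI->new reads the request.
$CGI::POST_MAX        = 100 * 1024;  # reject POST bodies over 100 KB (arbitrary limit)
$CGI::DISABLE_UPLOADS = 1;           # refuse file uploads entirely

my $q = CGI->new;

# cgi_error() reports problems found while reading the request,
# such as a POST body that exceeded POST_MAX.
if (my $error = $q->cgi_error) {
    print $q->header(-status => $error), "Request rejected: $error\n";
    exit;
}
```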

        Thanks for the replies!

        Well instead of CGI.pm, as of now I'm using this example I got from a web site.

        sub startup {
            $query = $ENV{'QUERY_STRING'};
            if ($query) {
                @pairs = split(/&/, $query);
            } else {
                read(STDIN, $buffer, $ENV{'CONTENT_LENGTH'});
                @pairs = split(/&/, $buffer);
            }
            foreach $pair (@pairs) {
                $something_in = 1;
                ($name, $value) = split(/=/, $pair);
                $value =~ tr/+/ /;
                $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
                $name =~ tr/+/ /;
                $name =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
                if ($INPUT{$name}) {
                    $INPUT{$name} = $INPUT{$name}.",".$value;
                } else {
                    $INPUT{$name} = $value;
                }
            }
        }

        # Then, when I want to retrieve the value of the input field called "name", I use:
        &startup;
        print "$INPUT{'name'}";

        This is the other alternative I was talking about. Is this as safe as using CGI.pm's input handling?

        Thanks!
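A few of the gaps in a hand-rolled parser like the one above are easier to see next to a stricter version. This is only a sketch to show where the pitfalls are, not a substitute for CGI.pm: it does not read POST bodies, enforce size limits, or handle multipart uploads. The name `parse_query` and the sample query string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: decode an application/x-www-form-urlencoded string
# into a hash of array refs, so repeated fields stay separate instead of
# being joined with commas (which is ambiguous if a value contains a comma).
sub parse_query {
    my ($raw) = @_;
    my %input;
    # Accept both '&' and ';' as pair separators.
    for my $pair (split /[&;]/, $raw // '') {
        next unless length $pair;
        # A limit of 2 keeps any '=' inside the value intact.
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        for ($name, $value) {
            tr/+/ /;                               # '+' encodes a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # decode %XX escapes
        }
        push @{ $input{$name} }, $value;
    }
    return \%input;
}

my $input = parse_query('name=Ny+Worker&tag=a%3Db&tag=c');
print "name: $input->{name}[0]\n";
print "tags: @{ $input->{tag} }\n";
```

Note also that the original tests `if ($INPUT{$name})`, so a first value of "0" or an empty string is silently overwritten by the next one; collecting values in an array sidesteps that.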
Re: Form, Input, Taint related
by chas (Priest) on Apr 11, 2005 at 02:28 UTC
    The CGI module is definitely the way to go! It will decode form data, automatically deal with GET and POST data, and handle a multitude of other things that are very tricky to write yourself. You can get good info on using it from "perldoc CGI". As to taint mode, you probably can't do better than "perldoc perlsec". Someone will probably give you some examples in their reply, but you should really read the docs mentioned above; there are good examples there.
    chas
    (Update: Lincoln Stein (the creator of CGI.pm) has written a book - "Official Guide to Programming with CGI.pm" (Wiley Publ.) which is very informative and easy to read. Also, Ovid has a Web Programming Course linked from his home node which is very nice! You can probably find everything you need to know there.)
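A minimal sketch of reading a form field with CGI.pm, assuming the module is installed (it has not shipped with Perl itself since 5.22). The field names are made up, and passing a query string to new() is done here only so the example runs outside a web server.

```perl
use strict;
use warnings;
use CGI;

# In a real CGI script, CGI->new with no arguments reads the GET or POST
# data itself; the string here stands in for a submitted form.
my $q = CGI->new('name=Ny+Worker&lang=perl');

# param() in scalar context returns the decoded value of one field.
my $name = $q->param('name');
my $lang = $q->param('lang');

print "name: $name\n";
print "lang: $lang\n";
```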
Re: Form, Input, Taint related
by tlm (Prior) on Apr 11, 2005 at 02:51 UTC

    Look at perlsec, especially the section Laundering and Detecting Tainted Data.
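To see the difference between tainted and untainted values for yourself, something along these lines may help. It is a sketch that uses `tainted` from the core Scalar::Util module; the helper name `taint_status` is made up. Run it once as `perl script.pl` and once as `perl -T script.pl` and compare.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: describe whether a value is currently tainted.
# tainted() can only return true when perl is running with -T.
sub taint_status {
    my ($label, $value) = @_;
    return sprintf '%s is %s', $label, tainted($value) ? 'tainted' : 'clean';
}

# Under -T, data that comes from outside the program (such as %ENV)
# is tainted; a literal written in the source code never is.
print taint_status('$ENV{PATH}', $ENV{PATH}), "\n";
print taint_status('a literal', 'hello'), "\n";
```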

    As far as alternatives to CGI.pm go, I don't know of one, at least for basic CGI scripting (which I take it is where you are at the moment). But why do you want an alternative? Is there some reason why you don't want to use CGI.pm? It's one of the all-time best modules out there. Just be glad it exists, and go for it.

    the lowliest monk