http://qs1969.pair.com?node_id=340425

muba has asked for the wisdom of the Perl Monks concerning the following question:

Reading this node, a question arose in my mind. Before I joined PM, I didn't use warnings; use strict; either. I did, however, know what references are and how to use them. I did not know what taints are, and I still don't.

So, what are/is taints/tainting?

Re: So, now what are taints?
by adrianh (Chancellor) on Mar 28, 2004 at 19:59 UTC

    Tainting is a way of marking information that might be a potential security issue (e.g. provided by the user, can be overridden by a third party, etc.). Trying to use tainted data in an operation that affects something outside your program (running a command, writing to a file, etc.) causes your program to die - preventing any potential damage.

    See perlsec for full details.

Re: So, now what are taints?
by biosysadmin (Deacon) on Mar 28, 2004 at 20:03 UTC
    It's a mode used to keep you honest when accepting input from potentially unsafe sources, such as the parameters passed to CGI scripts. It's a good thing for certain applications, but you should definitely understand what it's doing before using it. warnings and strict are more general pragmas that could reasonably be applied to all of your programs; taint checking has a more specific application.

    See the perlsec manpage for more information.

Re: So, now what are taints?
by cLive ;-) (Prior) on Mar 28, 2004 at 22:38 UTC
    It's a way of (hopefully) stopping you from making silly mistakes.

    Every piece of data that comes into the script from outside is considered tainted, and can't be used to affect anything outside the script, unless you explicitly extract it with a regular expression capture (I think; there may be other ways to untaint, though).

    Why is this useful? Let's say you had a script that accepted a domain name from a web form and you wanted to ping that domain.

    my $q = CGI->new();
    my $domain = $q->param('domain');
    my $result = `ping $domain`;
    Under taint, this would die because you're trying to pipe some tainted data to an external program. Imagine what would happen if some malicious user submitted "localhost; rm -rf /" as the domain name!

    So, under taint, you would need to explicitly grab the domain from the variable:

    my $domain = '';
    $q->param('domain') =~ /^([a-zA-Z0-9\.]+)$/ and $domain = $1;
    That's just a rough expression to grab the domain. The point is that you know that there won't be anything malicious in $domain when it's assigned.
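
A minimal sketch of the same idea as a reusable check, assuming a stricter pattern than the rough one above. The helper name untaint_domain and the exact character set are illustrative assumptions, not part of the original reply. It runs with or without -T; under -T the returned value is untainted because it comes from a capture.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a validated copy of $input if it looks like
# a host name, or undef otherwise.
sub untaint_domain {
    my ($input) = @_;
    return undef unless defined $input;

    # \A and \z anchor the whole string. The first character must be a letter
    # or digit, so a value such as "-c" cannot be mistaken for an option.
    if ($input =~ /\A([a-zA-Z0-9][a-zA-Z0-9.-]*)\z/) {
        return $1;    # captured substrings are untainted
    }
    return undef;
}

for my $candidate ('example.com', 'localhost; rm -rf /') {
    my $clean = untaint_domain($candidate);
    print defined($clean) ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Returning undef on failure, rather than an empty string, makes it harder to pass a rejected value along by accident.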

    But, untainting data in itself does not protect you. You could, if you wished, untaint it like this:

    $q->param('domain') =~ /^(.+)$/ and $domain = $1;
    but you won't have added to your security understanding if you do :) There are times though, when you don't care what a value contains and, in those instances, it would be perfectly acceptable to untaint like that. Just as long as you know for sure!

    I wrote a little article on it here if you're interested.

    .02

    cLive ;-)

      You are not mentioning that there are actually two pieces of data there that need untainting: one is the domain parameter obtained from the CGI, but the other is the PATH environment variable of your program. If you use backticks like that, and don't set up your PATH explicitly, perl -T will complain.

      That appears not to make sense in a CGI environment, but it makes a lot of sense when you're writing setuid root scripts that can be run from the command line.
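
A minimal sketch of setting up the environment before running an external command, along the lines perlsec suggests. The directories in PATH and the helper name capture are assumptions for illustration; adjust them for your system.

```perl
use strict;
use warnings;

# Replace the inherited PATH with a fixed one (these directories are an assumption).
$ENV{PATH} = '/bin:/usr/bin';

# Remove variables that can change how a shell behaves, as perlsec suggests.
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical helper: run a command without going through a shell and
# return its output lines. With the list form of a piped open, characters
# such as ";" in an argument are not interpreted by a shell.
sub capture {
    my @command = @_;
    open(my $pipe, '-|', @command) or die "cannot run $command[0]: $!";
    my @lines = <$pipe>;
    close($pipe);
    return @lines;
}

# Usage (not executed here): my @output = capture('ping', '-c', '1', $domain);
```

The list form is a different technique from the backticks in the original example; it still needs a validated $domain, it just removes the shell from the picture.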

        Indeed. Sorry - but it is mentioned in the article I linked to that I wrote on untainting :)

        cLive ;-)

Re: So, now what are taints?
by graff (Chancellor) on Mar 29, 2004 at 02:32 UTC
    It might clarify the idea if you rephrase the question as "what is tainted data?" Data should be considered "tainted" when it comes from a source outside the direct control and complete trust of your system and perl script, such as parameters being submitted on a web form from a remote client, or data coming through a port or socket from a remote host, or data in a file that has global write access.

    "Taint" mode in perl is a method of making sure that "tainted data" is officially "quarantined", and is not allowed to be involved in any operation where it could cause damage (whether malicious or simply accidental), such as being used as part of a command line for a sub-shell, or evaluated as code by a string "eval", or (when the database layer is set up to check for it, as with DBI's TaintIn attribute) used as part of an SQL statement passed to a database server.
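
To see which values perl itself regards as tainted, a short sketch using Scalar::Util's tainted function may help. The helper name describe is an assumption for illustration. Run without -T, everything reports as clean; run as perl -T, values from the environment and the command line should report as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: label a value according to perl's own taint flag.
sub describe {
    my ($value) = @_;
    return tainted($value) ? 'tainted' : 'clean';
}

# ${^TAINT} is true when taint checking is enabled (-T or -t).
print "taint mode is ", (${^TAINT} ? "on" : "off"), "\n";
print "literal string: ", describe('hello'), "\n";
print "PATH from the environment: ", describe($ENV{PATH}), "\n";
print "first command-line argument: ", describe($ARGV[0]), "\n";
```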

Re: So, now what are taints?
by muba (Priest) on Mar 29, 2004 at 10:11 UTC
    Well... I could reply with a 'thank you' to each of the answers, but instead I'll do it all at once. Combining all the answers and the perlsec manpage, I now also understand what tainting is and why it (is|may be) useful.

    So, thanks!
      Yes, thank you for not replying to everyone. It is very annoying. Next time, please consider using the Chatterbox for this or Super Search. This question has been asked at least 100 billion times now.