Lawliet has asked for the wisdom of the Perl Monks concerning the following question:

First of all, I would like to thank those who assisted me with this problem a few days ago in the CB. I am posting here to get a wider range of opinions on what to do.

I have a CGI form that asks the user for their username and inserts it into a database. I then retrieve it from the database and display it on the web page. I use placeholders when inserting into the database, so that takes care of SQL injection. To make the displayed value safe, I am currently using the HTML::Entities module: $username = encode_entities($q->param('name')), which seems to work fine. Of course, I am not well educated in website vulnerabilities, so I do not know if that small piece of code will protect me.

Is there something more I can do or will that line of code take care of the problem?


Update: Just thought of another question (and feel like an idiot for not knowing the answer :[). In the CGI script, I have the credentials to connect to a MySQL database. Is there any way for someone to inspect the CGI script itself, bypassing the HTML it generates?

Update2: Taint-mode has been brought to my attention. It seems like an excellent way to secure user input. Should it be used in conjunction with the other methods suggested in this node (and comments), or is it good enough by itself?

I'm so adjective, I verb nouns!

chomp; # nom nom nom

Re: Removing malicious HTML entities (now with more questions!)
by zentara (Cardinal) on Aug 16, 2008 at 14:16 UTC
    <voice of doom and gloom>

    See Security techniques every programmer should know for a good overview of cgi security problems.

    If you really want to be sure of your CGI security, you will need to run your own server. All the people with root access on your hosting service can read (and temporarily modify) your script, not to mention government people who can now legally inspect your operation (part of the anti-terror stuff). Do you really trust all those people?

    That's why web-store farms are becoming so popular. Why take the risk yourself of handling all those credit card numbers and private info, when Yahoo or someone else will do the scripting for you and has a bank of lawyers to defend them when things go wrong?

    The sad fact is that the people running the OS on your hosting server control your security, by being diligent about applying security patches and by screening employees with physical (and root) access to the server(s).

    All you can do is take standard precautions, like filtering NULL bytes, avoiding world-writable files and directories, never allowing user-privilege escalation, using SSL wherever passwords and private info are passed, etc. That is called "due diligence" in legalese... and means you won't be held negligent if things go South. Protect yourself.

    Think about what would happen if your database files get stolen. People will blame you, you will blame the server operator for lax security, and it will all get complicated fast. Almost all of the time, the exact hole will never be proven, and it will get blamed on some truck driver for losing a box of backup tapes.

    The government, which is supposedly fanatic about secrecy (at least certain departments), will have the servers locked in rooms, under constant video surveillance, and electromagnetically shielded. You mean your hosting service doesn't have that? Oh.... you are wide open to the right people.

    </voice of doom and gloom>


    I'm not really a human, but I play one on earth Remember How Lucky You Are
Re: Removing malicious HTML entities (now with more questions!)
by dHarry (Abbot) on Aug 16, 2008 at 12:09 UTC

    How safe do you want it to be? For example, if you use http (instead of https), the password will be sent unencrypted over the internet. Not particularly safe :-) It depends on your requirements.

    With respect to your last question, if somebody can read the file he can obviously intercept the credentials. You have to think about file permissions and where to put what file.

    See for example CGI Programming with Perl, 2nd Edition, Chapter 8 Security .

      There are no passwords. By 'safe', I meant 'unable to be exploited' (leading to me replacing html markup).

      Regarding the interception discussion, what methods could the user use? The only thing I can think of is downloading the cgi file through the use of wget (or anything, really). Then open and read.

      Update: Never mind, that method does not work. It downloads the HTML the CGI file outputs. But what other ways were you referring to?

      I'm so adjective, I verb nouns!

      chomp; # nom nom nom

        See the link I provided, and also see Hacking CGI. Just Google or Super Search the Monastery.
Re: Removing malicious HTML entities (now with more questions!)
by graff (Chancellor) on Aug 16, 2008 at 17:25 UTC
    Update2: Taint-mode has been brought to my attention. It seems like an excellent way to secure user input. Should it be used in conjunction with the other methods suggested in this node (and comments), or is it good enough by itself?

    Taint mode is simply a means for making sure that you actually do use "the other methods suggested." All it does, really, is cause your script to die if/when it tries to do anything it shouldn't do with untrusted data. If you haven't used it yet, but your script is already written in a fully secure way, adding "-T" on the shebang line will make no difference.

    If you have forgotten to cover any vulnerabilities, or if you later modify the script and accidentally introduce a vulnerability, having "-T" on the shebang line will make a difference: the script will die with an error message about the nature of the problem.

    The one big problem with "-T" is that it can be remarkably easy to disable its usefulness as a safety device, simply by taking inappropriate steps to "untaint" your untrusted data.

    Consider the following script, which is potentially quite dangerous to run (so don't use it at all if you don't understand what the risks are):

    #!/usr/bin/perl -T
    use strict;
    use warnings;
    $ENV{PATH}="/bin";
    while (<>) {
        chomp;
        my $str = '';
        if ( /(.+)/ ) {
            $str = $1;
        }
        system( "echo $str" );
    }
    Having taint mode turned on does not stop that script from causing any given amount of damage or mischief, because the regex match, which satisfies the requirements for untainting data, does nothing at all to protect you from the bad things that could happen.
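
For contrast, a safer variant of the script above might untaint with a narrow allowlist pattern and call system in list form, so that no shell ever parses the input. This is only a sketch: the helper name untaint_word, the allowed character set, and the sample inputs are illustrative assumptions, not part of the original post. It is meant to be run under taint mode (perl -T), but the helper behaves the same without it.

```perl
use strict;
use warnings;

# Taint mode wants a known-safe environment before running external commands.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical helper: returns the captured (untainted) value only if the
# whole input consists of allowlisted characters, otherwise undef.
sub untaint_word {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ /\A([A-Za-z0-9_-]{1,32})\z/ ? $1 : undef;
}

# Sample inputs (made up): one harmless, one attempting shell injection.
for my $line ( 'hello', 'hello; rm -rf ~' ) {
    my $word = untaint_word($line);
    if ( defined $word ) {
        # List form: the argument goes straight to echo, no shell involved.
        system( 'echo', $word ) == 0 or warn "echo failed: $?\n";
    }
    else {
        warn "rejected input outside the allowlist\n";
    }
}
```

The point of the pattern is that it describes what valid data looks like, rather than matching anything at all as /(.+)/ does.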
      "If you later modify the script and accidentally introduce a vulnerability, having "-T" on the shebang line will make a difference"

      That is why I plan on using it ^.^

      I'm so adjective, I verb nouns!

      chomp; # nom nom nom

        You should always plan to use it with CGI scripts.

        The trick to untaint data, as far as I am aware, is to ensure your data is correct, i.e. do data validation. Usually this means using (tight) regexps to ensure the user input doesn't go outside expected bounds.

        From what I have read, if you are entering anything into a db then you might want to SQL-escape it too so that people can't hijack your database and delete everything.

        HTML::Entities will help display stuff that might otherwise break your web page - what's left that can break your db?
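
On the database side, the placeholders the original poster already uses generally make separate SQL escaping unnecessary, because the value travels to the driver apart from the SQL text. A minimal sketch, assuming $dbh is a connected DBI database handle; the users table, the username column, and the store_username name are made up for illustration:

```perl
use strict;
use warnings;

# $dbh is assumed to be a DBI database handle, for example one returned by
# DBI->connect($dsn, $user, $password, { RaiseError => 1 }).
# Table and column names here are hypothetical.
sub store_username {
    my ( $dbh, $name ) = @_;

    # The "?" placeholder keeps the value out of the SQL text entirely,
    # so quotes or semicolons in $name cannot change the statement.
    my $sth = $dbh->prepare('INSERT INTO users (username) VALUES (?)');
    return $sth->execute($name);
}
```

Validation with a tight regexp is still worth doing on top of this, but for correctness of the data rather than as the defence against SQL injection.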

Re: Removing malicious HTML entities (now with more questions!)
by jettero (Monsignor) on Aug 16, 2008 at 12:42 UTC
    The usual way to handle something like this, particularly if you don't know what will cause harm, is to select a set of tags and attributes that should be allowed and remove everything else.

    For an example: see perlmonks. Below the input box is the list of tags that will work. The reason they do this is simple: who knows what might cause harm? But we can be reasonably certain the strong and emphasis tags are ok.

    -Paul
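
One way to sketch the allowlist idea with the HTML::Entities module already mentioned in this thread is to escape everything first and then restore only a fixed set of attribute-free tags. The function name allow_basic_tags and the tag list are assumptions chosen for illustration; a parser-based filter is likely a better fit for anything beyond this simple case.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Tags allowed through; attributes are never permitted in this sketch.
my @ALLOWED_TAGS = qw(b i em strong);

sub allow_basic_tags {
    my ($text) = @_;

    # Escape everything first, so nothing reaches the page as markup by default.
    my $safe = encode_entities( $text, '<>&"\'' );

    # Then restore only exact opening and closing tags from the allowlist.
    my $tags = join '|', @ALLOWED_TAGS;
    $safe =~ s{&lt;(/?)($tags)&gt;}{<$1\L$2\E>}gi;

    return $safe;
}

print allow_basic_tags('<b>hi</b><script>alert(1)</script>'), "\n";
```

Because a tag with any attribute (such as an onclick handler) no longer matches the exact pattern, it stays escaped and is shown as text.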

      dHarry's link suggested that as well. Thanks for the second (and third) opinion.

      I'm so adjective, I verb nouns!

      chomp; # nom nom nom

Re: Removing malicious HTML entities (now with more questions!)
by starX (Chaplain) on Aug 16, 2008 at 12:24 UTC
    To second what dHarry says, how safe do you want it to be? You seem to be covering the basics pretty well, but if you're really worried about security beyond that, you might consider using SSL to encrypt the connection. CPAN can help you here.
Re: Removing malicious HTML entities (now with more questions!)
by Krambambuli (Curate) on Aug 16, 2008 at 14:21 UTC
    ... Is there any way for someone to inspect the CGI script itself, bypassing the HTML it generates? ...

    That depends more on the web server's security than on the script's.

    The answer is "normally, no" - but it happens occasionally that a webserver config is altered by mistake and then CGI scripts are not rendered to HTML but handed out as plain text. Having the password[s] in the source would obviously expose them in such situations.

    To be on the safe side, you would put your passwords in an external file and just read/include that file in your script.

    Krambambuli
    ---
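
A minimal sketch of the external-file approach, assuming a simple "key = value" file stored outside the directories the web server exposes. The function name read_credentials, the file path, and the key names are hypothetical:

```perl
use strict;
use warnings;

# Reads "key = value" lines from a file kept outside the web root.
# Lines starting with "#" and blank lines are ignored.
sub read_credentials {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    my %cred;
    while ( my $line = <$fh> ) {
        next if $line =~ /^\s*(?:#|$)/;
        if ( $line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/ ) {
            $cred{$1} = $2;
        }
    }
    close $fh;
    return \%cred;
}

# Usage (assumed path and keys):
#   my $cred = read_credentials('/home/someuser/private/db.conf');
#   my $dbh  = DBI->connect($dsn, $cred->{user}, $cred->{password});
```

If the server ever hands out the script as plain text, only the path is exposed, and that path is not reachable through a URL. The file should still be readable only by the account the script runs as.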
      One common mistake is to leave debugging output to the browser on. It can dump out some useful info to hackers.
      use CGI::Carp qw(fatalsToBrowser);
      die "Bad error here";
      Another common mistake is to upload an updated CGI script and not name it properly or give it executable permissions. If the server doesn't catch it, a non-executable script can be returned as a text file. The server config file should catch it, but are you sure? I've been surprised a couple of times to see my script displayed as a text file, because it was mode 0644.

      I'm not really a human, but I play one on earth Remember How Lucky You Are

      If I put it in an external file and then opened the file in the cgi script, couldn't the perpetrator see the filepath and navigate there?

      I'm so adjective, I verb nouns!

      chomp; # nom nom nom

        The files accessible via URL are controlled by the server's URL-to-filesystem mapping and other things. Normally, you would have very little of the server's filesystem exposed.

        sas
Re: Removing malicious HTML entities (now with more questions!)
by Jenda (Abbot) on Aug 16, 2008 at 14:26 UTC

    The encode_entities() should be enough. That is if you insert the value into places like <p>HERE</p> or <input type="text" value="HERE">. If on the other hand you use it in <script>alert('HERE');</script> it's escaped wrong. Likewise in this case: <a href="page.pl?value=HERE"> or just <a href="HERE">.
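
To illustrate why each of those places needs its own escaping, here is a sketch with one function per context. The names for_html, for_url_param, and for_js_string are made up for this example, and the escaping rules are deliberately conservative assumptions rather than the only correct choice.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# HTML text or a quoted attribute value: entity-encode markup characters.
sub for_html {
    my ($value) = @_;
    return encode_entities( $value, '<>&"\'' );
}

# URL query component: percent-encode every byte outside the unreserved set.
sub for_url_param {
    my ($value) = @_;
    utf8::encode($value);    # work on UTF-8 bytes, not characters
    $value =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $value;
}

# JavaScript string literal: write every character in the 16-bit range that
# is not an ASCII letter, digit or space as \uXXXX. Higher code points are
# left alone.
sub for_js_string {
    my ($value) = @_;
    $value =~ s/([^A-Za-z0-9 \x{10000}-\x{10FFFF}])/sprintf '\\u%04X', ord $1/ge;
    return $value;
}

my $name = q{O'Brien & <Sons>};
print '<p>', for_html($name), "</p>\n";
print '<a href="page.pl?value=', for_html( for_url_param($name) ), qq{">link</a>\n};
print "<script>alert('", for_js_string($name), "');</script>\n";
```

A value placed in a query string inside an href is escaped twice, first for the URL and then for the HTML attribute. Using untrusted input as the entire href would additionally need a check on the URL scheme, which this sketch does not attempt.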

    Ad Update1: There should not be, but there have been bugs in web servers that allowed things like this. It's safer to store the credentials in a different file outside the directories accessible by HTTP.

      This is how I do it - it's part of my regular CGI::Application framework. But you can also get Vars() from CGI.pm or CGI::Simple. The extra options let you skip encoding for some or all fields (say, when creating a CMS and you don't want it to mess up any HTML code inside).
      sub form {
          my $self   = shift;
          my %params = @_;
          my $skip   = array_to_hash($params{'skip_fields'});    # Array/ArrayRef
          my $q      = $self->query();
          my %vars   = $q->Vars();
          unless ($params{dont_encode_fields}) {
              use HTML::Entities;
              foreach (keys %vars) {
                  next if $skip->{$_};    # Don't encode if it's in skip list
                  $vars{$_} = HTML::Entities::encode_entities($vars{$_}, '<>&"');
              }
          }
          return \%vars;
      }
      PS. Later I found out about the grep trick to check if a variable is in the array - should change this ...

      Have you tried freelancing? Check out Scriptlance - I work there. For more info about Scriptlance and freelancing in general check out my home node.

        I don't think it's a good idea to escape the values upon reading them. What if you are gonna need them raw? What if you're gonna need them URL escaped or escaped for inclusion in a JavaScript string literal or or or or.

        Besides not all data will come into your script from the form/query so you'll have to either escape everything, no matter where it comes from or keep track of what is and what is not escaped.

        Escape before you output, not when you input, because only at output do you really know how you need to escape.