Removing malicious HTML entities (now with more questions!)

Lawliet has asked for the wisdom of the Perl Monks concerning the following question:

Re: Removing malicious HTML entities (now with more questions!)
by zentara (Cardinal) on Aug 16, 2008 at 14:16 UTC

<voice of doom and gloom>

See Security techniques every programmer should know for a good overview of CGI security problems.

If you really want to be sure of your CGI security, you will need to run your own server. All the people with root access on your hosting service can read (and temporarily modify) your script, not to mention government people who can now legally inspect your operation (part of the anti-terror stuff). Do you really trust all those people?

That's why web-store farms are becoming so popular. Why take the risk of handling all those cc numbers and private info yourself, when Yahoo or someone else will do the scripting for you and has a bank of lawyers to defend itself when things go wrong?

The sad fact is that the people running the OS on your hosting server control your security, by being diligent about applying security patches and screening employees with physical (and root) access to the server(s).

All you can do is take standard precautions, like filtering NUL bytes, avoiding world-writable files and directories, never allowing user-privilege escalation, using SSL wherever passwords and private info are passed, etc. That is called "due diligence" in legalese... and means you won't be held negligent if things go South. Protect yourself.
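
As one illustration of the first precaution, here is a minimal sketch of why NUL bytes get filtered: a Perl string may contain "\0", while C-level system calls treat it as the end of the string, so a check done in Perl and the name the OS finally sees can disagree. The helper names (looks_like_html, has_nul) and the file name are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical check: does a user-supplied name end in ".html"?
sub looks_like_html {
    my ($name) = @_;
    return $name =~ /\.html\z/ ? 1 : 0;
}

# Hypothetical filter: does the value carry an embedded NUL byte?
sub has_nul {
    my ($value) = @_;
    return index( $value, "\0" ) >= 0 ? 1 : 0;
}

# The suffix check is satisfied, yet a C string would end at "secret.conf".
my $evil = "secret.conf\0.html";
printf "suffix check passes: %d, contains NUL: %d\n",
    looks_like_html($evil), has_nul($evil);
```

Rejecting such input outright, before any other validation runs, is simpler than trying to clean it.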

Think about what would happen if your database files get stolen. People will blame you, you will blame the server operator for lax security, and it will all get complicated fast. Almost all of the time, the exact hole will never be proven, and it will get blamed on some truck driver for losing a box of backup tapes.

The government, which is supposedly fanatical about secrecy (at least certain departments), will have the servers locked in rooms, under constant video surveillance, and electromagnetically shielded. You mean your hosting service doesn't have that? Oh.... you are wide open to the right people.

</voice of doom and gloom>

I'm not really a human, but I play one on earth. Remember How Lucky You Are

Re: Removing malicious HTML entities (now with more questions!)
by dHarry (Abbot) on Aug 16, 2008 at 12:09 UTC

How safe do you want it to be? For example, if you use http (instead of https), the password will be sent unencrypted over the internet. Not particularly safe :-) It depends on your requirements.

With respect to your last question, if somebody can read the file he can obviously intercept the credentials. You have to think about file permissions and where to put what file.

See for example CGI Programming with Perl, 2nd Edition, Chapter 8, Security.

Re^2: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 16, 2008 at 12:19 UTC

There are no passwords. By 'safe', I meant 'unable to be exploited' (leading to me replacing html markup).

Regarding the interception discussion, what methods could the user use? The only thing I can think of is downloading the cgi file through the use of wget (or anything, really). Then open and read.

Update: Never mind, that method does not work. It downloads the HTML the CGI file outputs. But what other ways were you referring to?

I'm so adjective, I verb nouns!

chomp; # nom nom nom

Re^3: Removing malicious HTML entities (now with more questions!)

by dHarry (Abbot) on Aug 16, 2008 at 12:26 UTC

Hacking CGI

Re^4: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 16, 2008 at 12:31 UTC

Re: Removing malicious HTML entities (now with more questions!)
by graff (Chancellor) on Aug 16, 2008 at 17:25 UTC

Update2: Taint-mode has been brought to my attention. It seems like an excellent way to secure user input. Should it be used in conjunction with the other methods suggested in this node (and comments), or is it good enough by itself?

Taint mode is simply a means for making sure that you actually do use "the other methods suggested." All it does, really, is cause your script to die if/when it tries to do anything it shouldn't do with untrusted data. If you haven't used it yet, but your script is already written in a fully secure way, adding "-T" on the shebang line will make no difference.

If you have forgotten to cover any vulnerabilities, or if you later modify the script and accidentally introduce a vulnerability, having "-T" on the shebang line will make a difference: the script will die with an error message about the nature of the problem.

The one big problem with "-T" is that it can be remarkably easy to disable its usefulness as a safety device, simply by taking inappropriate steps to "untaint" your untrusted data.

Consider the following script, which is potentially quite dangerous to run (so don't use it at all if you don't understand what the risks are):

#!/usr/bin/perl -T

use strict;
use warnings;

$ENV{PATH}="/bin";

while (<>) {
    chomp;
    my $str = '';
    if ( /(.+)/ ) {
        $str = $1;
    }
    system( "echo $str" );
}
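
Why the script above is dangerous: /(.+)/ matches any non-empty line, so $1 comes back untainted whatever it contains, and the one-string form of system() passes it through the shell. A line such as "hi; cat /etc/passwd" can therefore run two commands with taint mode fully satisfied. A minimal sketch of a stricter version follows, assuming the input is meant to be one short word; the helper name untaint_word and the permitted character set are illustrative choices, not part of the original script. The shebang is left out so the block runs with or without -T on the command line.

```perl
use strict;
use warnings;

# Known-good environment for child processes, as perlsec recommends.
$ENV{PATH} = '/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical helper: return an untainted copy of $input only if the
# whole string is 1 to 64 letters, digits, underscores or hyphens.
sub untaint_word {
    my ($input) = @_;
    return unless defined $input;
    return $input =~ /\A([A-Za-z0-9_-]{1,64})\z/ ? $1 : undef;
}

for my $line ( 'hello', 'hello; cat /etc/passwd', '$(id)' ) {
    my $word = untaint_word($line);
    if ( defined $word ) {
        # List form of system(): no shell runs, so metacharacters stay inert.
        system( '/bin/echo', $word ) == 0 or warn "echo failed: $?\n";
    }
    else {
        warn "rejected input: not a plain word\n";
    }
}
```

The anchors \A and \z matter as much as the character class: without them the pattern would accept any string that merely contains one acceptable run of characters.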

Re^2: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 16, 2008 at 19:04 UTC

"If you later modify the script and accidentally introduce a vulnerability, having "-T" on the shebang line will make a difference"

That is why I plan on using it ^.^

I'm so adjective, I verb nouns!

chomp; # nom nom nom

Re^3: Removing malicious HTML entities (now with more questions!)

by LesleyB (Friar) on Aug 17, 2008 at 23:31 UTC

You should always plan to use it with CGI scripts.

The trick to untainting data, as far as I am aware, is to ensure your data is correct, i.e. do data validation. Usually this means using (tight) regexps to ensure the user input doesn't go outside expected bounds.

From what I have read, if you are entering anything into a db then you might want to SQL-escape it too so that people can't hijack your database and delete everything.

HTML::Entities will help display stuff that might otherwise break your web page - what's left that can break your db?

Re^4: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 18, 2008 at 02:28 UTC

Re^5: Removing malicious HTML entities (now with more questions!)

by LesleyB (Friar) on Aug 18, 2008 at 10:56 UTC

Re^4: Removing malicious HTML entities (now with more questions!)

by techcode (Hermit) on Aug 19, 2008 at 21:53 UTC

Re^5: Removing malicious HTML entities (now with more questions!)

by LesleyB (Friar) on Aug 20, 2008 at 09:35 UTC

Some notes below your chosen depth have not been shown here

Re: Removing malicious HTML entities (now with more questions!)
by jettero (Monsignor) on Aug 16, 2008 at 12:42 UTC

should

For an example: see perlmonks. Below the input box is the list of tags that will work. The reason they do this is simple: who knows what might cause harm? But we can be reasonably certain the strong and emphasis tags are ok.
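
A minimal sketch of that allow-list idea: escape everything first, then re-enable only bare tags whose names are on a short list. The tag list, the helper names and the hand-rolled escape_html (a core-only stand-in for HTML::Entities::encode_entities) are assumptions for illustration; this is not a description of how PerlMonks itself does it.

```perl
use strict;
use warnings;

# Illustrative allow-list; a real site would choose its own.
my @ALLOWED    = qw(b i em strong code);
my $ALLOWED_RE = join '|', map { quotemeta($_) } @ALLOWED;

my %ESC = (
    '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&#39;',
);

# Core-only stand-in for HTML::Entities::encode_entities.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ESC{$1}/g;
    return $text;
}

# Escape everything, then restore only attribute-free <tag> and </tag>
# forms whose name is on the allow-list (normalised to lower case).
sub allow_basic_tags {
    my ($text) = @_;
    my $safe = escape_html($text);
    $safe =~ s{&lt;(/?)($ALLOWED_RE)&gt;}{<$1\L$2\E>}gi;
    return $safe;
}

print allow_basic_tags(q{<strong>hi</strong> <script>alert(1)</script>}), "\n";
```

Tags that carry attributes stay escaped, since attributes are where onclick handlers and javascript: URLs hide. The sketch does not check that tags are balanced, so an unclosed <b> would still need handling.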

-Paul

Re^2: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 16, 2008 at 12:45 UTC

dHarry's link suggested that as well. Thanks for the second (and third) opinion.

I'm so adjective, I verb nouns!

chomp; # nom nom nom

Re: Removing malicious HTML entities (now with more questions!)
by starX (Chaplain) on Aug 16, 2008 at 12:24 UTC

dHarry

CPAN

--starX
www.axisoftime.com

Re: Removing malicious HTML entities (now with more questions!)
by Krambambuli (Curate) on Aug 16, 2008 at 14:21 UTC

Is there any way for someone to inspect the CGI script itself, bypassing the HTML it generates?

not

Krambambuli
---

Re^2: Removing malicious HTML entities (now with more questions!)

by zentara (Cardinal) on Aug 16, 2008 at 14:33 UTC

use CGI::Carp qw(fatalsToBrowser);
die "Bad error here";

I'm not really a human, but I play one on earth. Remember How Lucky You Are

Re^2: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 16, 2008 at 14:35 UTC

If I put it in an external file and then opened the file in the cgi script, couldn't the perpetrator see the filepath and navigate there?

I'm so adjective, I verb nouns!

chomp; # nom nom nom

Re^3: Removing malicious HTML entities (now with more questions!)

by sasdrtx (Friar) on Aug 16, 2008 at 15:06 UTC

sas

Re^4: Removing malicious HTML entities (now with more questions!)

by Lawliet (Curate) on Aug 16, 2008 at 15:25 UTC

Re^5: Removing malicious HTML entities (now with more questions!)

by sasdrtx (Friar) on Aug 16, 2008 at 15:51 UTC

Re^5: Removing malicious HTML entities (now with more questions!)

by Perlbotics (Archbishop) on Aug 16, 2008 at 18:15 UTC

Re^5: Removing malicious HTML entities (now with more questions!)

by ikegami (Patriarch) on Aug 22, 2008 at 04:31 UTC

Re: Removing malicious HTML entities (now with more questions!)
by Jenda (Abbot) on Aug 16, 2008 at 14:26 UTC

encode_entities() should be enough, provided you insert the value into places like <p>HERE</p> or <input type="text" value="HERE">. If, on the other hand, you use it in <script>alert('HERE');</script>, it's escaped wrong. Likewise in this case: <a href="page.pl?value=HERE"> or just <a href="HERE">.

Ad Update1: There should not be, but there have been bugs in web servers that allowed things like this. It's safer to store the credentials in a different file outside the directories accessible by HTTP.
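
The three contexts can be sketched side by side. The helpers below are hand-rolled, core-only stand-ins written for this example (in real code HTML::Entities::encode_entities and URI::Escape::uri_escape_utf8 cover the first two); their names and exact character sets are assumptions, not anything prescribed above.

```perl
use strict;
use warnings;

my %HTML_ESC = (
    '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&#39;',
);

# Element text and quoted attribute values.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$HTML_ESC{$1}/g;
    return $text;
}

# One URL query component: encode to UTF-8 bytes, then percent-encode
# everything except the RFC 3986 unreserved characters.
sub escape_url_component {
    my ($text) = @_;
    utf8::encode($text);
    $text =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return $text;
}

# Body of a quoted JavaScript string literal inside a <script> block.
sub escape_js_string {
    my ($text) = @_;
    $text =~ s/([\\'"<>&\x00-\x1f\x{2028}\x{2029}])/sprintf '\\u%04X', ord $1/ge;
    return $text;
}

my $value = q{O'Reilly & <b>Sons</b>};
print '<p>', escape_html($value), "</p>\n";
print '<a href="page.pl?value=',
    escape_html( escape_url_component($value) ), qq{">link</a>\n};
print "<script>alert('", escape_js_string($value), "');</script>\n";
```

The href case layers two escapes, URL first and HTML second, because the value sits in a query string that itself sits in an attribute. When the whole URL is user-supplied, as in <a href="HERE">, the scheme needs checking as well, since no escaping makes a javascript: link harmless.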

Jenda
Support Denmark!
Defend the free world!

Re^2: Removing malicious HTML entities (now with more questions!)

by techcode (Hermit) on Aug 16, 2008 at 17:38 UTC

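use HTML::Entities;

# Return the CGI parameters as a hash ref, HTML-escaping every value
# unless its field is in skip_fields or dont_encode_fields is set.
sub form {
    my ($self, %params) = @_;

    # array_to_hash() is a helper defined elsewhere (not shown in this post)
    my $skip = array_to_hash($params{'skip_fields'}); # Array/ArrayRef

    my $q    = $self->query();
    my %vars = $q->Vars();

    unless ($params{dont_encode_fields}) {
        foreach (keys %vars) {
            next if $skip->{$_}; # Don't encode if it's in skip list
            $vars{$_} = HTML::Entities::encode_entities($vars{$_}, '<>&"');
        }
    }

    return \%vars;
}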

Have you tried freelancing? Check out Scriptlance - I work there. For more info about Scriptlance and freelancing in general check out my home node.

Re^3: Removing malicious HTML entities (now with more questions!)

by Jenda (Abbot) on Aug 16, 2008 at 21:16 UTC

I don't think it's a good idea to escape the values upon reading them. What if you are gonna need them raw? What if you're gonna need them URL-escaped, or escaped for inclusion in a JavaScript string literal, or, or, or...

Besides, not all data will come into your script from the form/query, so you'll have to either escape everything, no matter where it comes from, or keep track of what is and what is not escaped.

Escape before you output, not when you input, because only at output do you really know how you need to escape.
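
A small sketch of what goes wrong with escape-on-input, using a hand-rolled escape_html as a core-only stand-in for HTML::Entities::encode_entities. The sample value and the variable names are invented for the example.

```perl
use strict;
use warnings;

my %ESC = (
    '&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
    '"' => '&quot;', "'" => '&#39;',
);

# Core-only stand-in for HTML::Entities::encode_entities.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ESC{$1}/g;
    return $text;
}

my $raw = 'AT&T <support>';

# Escape on input: what gets stored is already HTML ...
my $stored_escaped = escape_html($raw);
# ... so an output layer that escapes (as it should) encodes it twice,
# and non-HTML consumers such as logs or e-mail see entity text.
my $double = escape_html($stored_escaped);

# Escape on output: keep the raw value, encode once per destination.
my $stored_raw = $raw;
my $for_html   = escape_html($stored_raw);

print "escaped on input, then output: $double\n";
print "escaped on output only:        $for_html\n";
```

Keeping the raw value also leaves length checks, comparisons and searches working on what the user actually typed.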

Jenda
Support Denmark!
Defend the free world!

Re^4: Removing malicious HTML entities (now with more questions!)

by techcode (Hermit) on Aug 17, 2008 at 02:35 UTC