You can do better than just replacing the "<" and ">" characters, as that alone does not prevent all attacks.
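To illustrate the point, here is a minimal sketch of one attack that survives such a filter: when the value ends up inside an HTML attribute, a double quote is enough to break out and add an event handler. The `naive_escape` helper and the input field template are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

# Hypothetical naive filter: only replaces the angle brackets.
sub naive_escape {
    my ($text) = @_;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Attacker-supplied value that contains no angle brackets at all.
my $input = q{" onmouseover="alert(1)};

# Hypothetical template that places the value inside an attribute.
my $html = '<input type="text" value="' . naive_escape($input) . '">';

print "$html\n";
# prints: <input type="text" value="" onmouseover="alert(1)">
```

The filter never fires because the payload has no "<" or ">" in it, yet the resulting markup carries an event handler.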
Have a look at HTML::Scrubber and HTML::Filter.
Using the first is as simple as this:
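use strict;
use warnings;
use CGI;
use HTML::Scrubber;

# Tags that are let through.
my @allow = qw[ ul li ol p br hr b a i pre blockquote tt dl dd dt ];

# Per-tag attribute rules.
my @rules = (
    script => 0,
    img    => {
        src => qr{^(http://)}i,    # only absolute image links allowed
        alt => 1,                  # alt attribute allowed
        '*' => 0,                  # deny all other attributes
    },
    a => {
        href  => qr{^(?!(?:java)?script)}i,    # href allowed unless it starts with "script" or "javascript"
        title => 1,                            # title attribute allowed
        rel   => 1,                            # link relationship
        '*'   => 0,                            # deny all other attributes
    },
);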
#
my @default = (
    0 =>    # default rule, deny all tags
    {
        '*'           => 1,    # default rule, allow all attributes
        'href'        => qr{^(?!(?:java)?script)}i,
        # a string longer than one character is treated as a regex too
        'cite'        => '(?i-xsm:^(?!(?:java)?script))',
        'language'    => 0,
        'name'        => 1,    # could be sneaky, but hey ;)
        'onblur'      => 0,
        'onchange'    => 0,
        'onclick'     => 0,
        'ondblclick'  => 0,
        'onerror'     => 0,
        'onfocus'     => 0,
        'onkeydown'   => 0,
        'onkeypress'  => 0,
        'onkeyup'     => 0,
        'onload'      => 0,
        'onmousedown' => 0,
        'onmousemove' => 0,
        'onmouseout'  => 0,
        'onmouseover' => 0,
        'onmouseup'   => 0,
        'onreset'     => 0,
        'onselect'    => 0,
        'onsubmit'    => 0,
        'onunload'    => 0,
        # The original listed 'src' twice (a regex, then 0); the later
        # entry wins in a Perl hash, so only the effective one is kept.
        'src'         => 0,    # denied by default; img has its own src rule
        'type'        => 0,
    }
);
#
# Create the scrubber.
#
my $safe = HTML::Scrubber->new();
$safe->allow(@allow);
$safe->rules(@rules);
$safe->default(@default);

# deny HTML comments
$safe->comment(0);

#
# Update each parameter with the cleaned version
#
my $form = CGI->new;
foreach my $p ( $form->param() )
{
    my $val = $form->param($p);    # scalar context: first value only
    $val = $safe->scrub($val);
    $form->param( $p, $val );
}
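As a quick check of what the scrubber does, here is a small sketch that runs one hostile string through a much reduced rule set. The input string and the shortened rules are made up for illustration, and the exact output may vary between HTML::Scrubber versions, so none is shown. The script tag, the onclick attribute and the comment should no longer be present in the result.

```perl
use strict;
use warnings;
use HTML::Scrubber;

# Reduced rule set, for illustration only.
my $scrubber = HTML::Scrubber->new();
$scrubber->allow(qw[ b i p ]);
$scrubber->default(
    0 => {                 # deny tags that are not explicitly allowed
        '*'       => 1,    # allow attributes by default ...
        'onclick' => 0,    # ... except this event handler
    }
);
$scrubber->comment(0);     # drop HTML comments

# Hypothetical hostile input.
my $dirty = '<b onclick="evil()">hi</b><script>alert(1)</script><!-- note -->';
my $clean = $scrubber->scrub($dirty);

print "$clean\n";
```

Running your own test strings through the scrubber like this is a cheap way to confirm that a rule set behaves as intended before wiring it into the CGI loop.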