OK, not having anything more at my disposal than CGI.pm, I'm planning on using these two functions to sanitize input data and escape JavaScript output.

I've set the functions up to accept and return either a single string or an array of strings.

Is there a faster way to do this? Am I setting myself up for problems?


update: replaced '#' with '&' in test example
use strict;
use warnings;

# strip any non-safe URL characters
# Note: This is not Data validation! Other
# code must verify/edit expected results
sub SafeURL {
    my @args = @_;
    local $_;
    foreach (@args) {
        s/[^\w\d.\@-]//gi if defined;
    }
    return wantarray ? @args : pop @args;
}

# Note: escape html covered by CGI escapeHTML()
# escape any non-safe javascript characters
sub EscapeJavaScript {
    my @args = @_;
    local $_;
    foreach (@args) {
        s/([^\w\d.\@-])/uc sprintf("%%%02x", ord($1))/egi if defined;
    }
    return wantarray ? @args : pop @args;
}

#####################
# test subs
my @array = qw(
    blah@&blah.blah/<test>
    lalalalal12340as-rqweousn
    //hokey/pokey
);

foreach (@array) {
    my $result1 = SafeURL($_);
    my $result2 = EscapeJavaScript($_);
    print "string: $_\n SafeURL: $result1\n EscapeJavaScript: $result2\n";
}

print "SafeURL array test: " . join(', ', SafeURL(@array)) . "\n";
print "EscapeJavaScript array test: " . join(', ', EscapeJavaScript(@array)) . "\n";
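
On the "is there a faster way" question, one alternative worth sketching is tr///, which deletes everything outside an allowed set in a single pass. This is only a sketch under the assumption that ASCII word characters are all that need to survive (tr/// cannot use \w, so the set is spelled out), and SafeURL_tr is a hypothetical name:

# Hypothetical tr///-based variant of SafeURL: deletes every character
# outside the listed set (ASCII letters, digits, _, ., @, -) in one pass.
sub SafeURL_tr {
    my @args = @_;
    for (@args) {
        tr/A-Za-z0-9_.@-//cd if defined;
    }
    return wantarray ? @args : $args[-1];
}

In scalar context this returns the last argument, mirroring the original's wantarray handling; whether it is actually faster is worth checking with the core Benchmark module (cmpthese) rather than assuming.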

Re: CGI param cleansing
by merlyn (Sage) on Jun 02, 2006 at 19:36 UTC
    I'm confused at the purpose of this code. There's nothing inherently dangerous in any character you can get from a browser via the param subroutine/method. Why do you think you need to "cleanse" them?

    And on the output, escapeHTML should take care of any hand-generated items, and the HTML generation subroutines should take care of the rest.

    What exactly is it that you think you need to "clean"?

    As an example, suppose I have a filename in $dangerous that could contain any character possible in a Unix pathname, and I want to both show its name and generate a link to it. All I have to do is this:

    use CGI qw(a escapeHTML);  # amongst other things
    ...
    print a({-href => $dangerous}, escapeHTML($dangerous));
    No extra code required.

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      You are basically right; however, I don't think this line is correct.

      print a({-href => $dangerous}, escapeHTML($dangerous));
      In this case, you have to URI-escape the filename, except for the slashes. Suppose for example that the filename is "a?b<c". Then the above example would print <a href="a?b&lt;c">a?b&lt;c</a>. When the viewer clicks on the link, the browser will HTML-unescape the attribute and load a?b<c prepended with the current base URL. The web server, however, would interpret this as loading the file a with the GET parameter being b<c. The code should instead have printed <a href="a%3Fb%3Cc">a?b&lt;c</a>.
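      A minimal sketch of that fix (an assumption on my part: URI::Escape from CPAN is available, and the safe-character list below deliberately leaves slashes and the usual unreserved characters alone):

      use CGI qw(a escapeHTML);
      use URI::Escape qw(uri_escape);

      # URI-escape the path for the href (everything except unreserved
      # characters and '/'), and HTML-escape the visible link text.
      my $href = uri_escape($dangerous, q{^A-Za-z0-9_.~/-});
      print a({-href => $href}, escapeHTML($dangerous));

      With the "a?b<c" example above, $href becomes a%3Fb%3Cc, so the generated anchor matches what should have been printed.
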
      Sorry, I was being vague. Somehow, abstracting the function from what I was using it for has made it less useful.

      Where I use SafeURL() is for re-using input data as values in URL links, since these values (as I've been using them) generally tend to be nothing more than simple text/email strings.

      For example, if mycgi.pl was called as
      http://mycgi.pl?sect=test
      I would use SafeURL() on the value of 'sect' and use it to create new dynamic links such as
      http://mycgi.pl?sect=test;page=22
      So I guess the purpose of SafeURL() is to make the data safe to feed back into a new URL, but that appears to be specific to my use, and probably not of much use beyond that.
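      For that feed-back-into-a-URL case, here is a sketch of the escaping route rather than the stripping route (assuming CGI::escape(), the URL-encoding helper that ships with CGI.pm, is acceptable; the ;page=22 part just mirrors the example link above):

      use CGI;

      # Rebuild the example link by URL-encoding the incoming 'sect'
      # value instead of stripping characters from it.
      my $q    = CGI->new;
      my $sect = $q->param('sect');    # e.g. "test"
      my $link = 'mycgi.pl?sect=' . CGI::escape($sect) . ';page=22';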

      Seemed like a good idea at the time :)

      EscapeJavaScript() still has its uses, unless CGI provides an equivalent function?