in reply to Re (tilly) 2: Opinions needed on CGI security
in thread Opinions needed on CGI security

I understand completely about filtering by what you will accept and not trying to imagine what to reject... I said as much in my post. My question is, if you have a CGI that does:
$a = "some CGI data <<script blah blah evil stuff"; open(F, ">>file.txt") or die "Can't open file.txt: $!"; print F $a; close(F);
... and that's the sum total of the CGI's interaction with the rest of the world, what could a hacker (or anyone) do that's evil? Now, if you will (say) be outputting a web page based on this data later on that's a different story... but that's not the question.

My point is that I agree wholeheartedly that we should be as diligent as necessary to secure our programs and our data. But at some point (and this is a good example) "diligence" turns into unnecessary paranoia.

Gary Blackburn
Trained Killer

Update: Ok, so maybe the point from the original poster was to use the data to populate a web page. :-P Seems to me in that case that there's no reliable way of filtering out all possible evil HTML/Javascript (please, someone correct me if there is). But other than that, what else does the poster need to do?

Re: Re: Re (tilly) 2: Opinions needed on CGI security
by AgentM (Curate) on Feb 14, 2001 at 09:56 UTC
    Along merlyn's lines, an easy way to ensure that the user remains within your bounds is to imagine what Apache does when one specifies Order deny,allow. If the input doesn't meet your guidelines, reject the whole thing on sight of the first mistake. Yes, simply return to the user "no dice". It's simply not worth trying to make replacements which then need to be checked again. In the example above, if one wants to reject certain data, then reject the whole shebang if any "evil hacker stuff" is included. That means, first clearly define which tags ARE acceptable. If ANYTHING else is used, reject the entire input and ask the user for clarification. That's not exactly what this site does, but PM does restrict tags along these lines.

    The other way is just too hazy. From CGI pod (I've always found this entertaining):

    If you import a function name that is not part of CGI.pm, the module will treat it as a new HTML tag and generate the appropriate subroutine. You can then use it like any other HTML tag. This is to provide for the rapidly-evolving HTML "standard." For example, say Microsoft comes out with a new tag called <GRADIENT> (which causes the user's desktop to be flooded with a rotating gradient fill until his machine reboots). You don't need to wait for a new version of CGI.pm to start using it immediately:

    use CGI qw/:standard :html3 gradient/;
    print gradient({-start=>'red',-end=>'blue'});

    If you only filter script tags, then you're missing this DoS HTML tag. On the other hand, if you know which tags are good and ignore all others, you're set for life without trying to track down new exploits.
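
    A minimal sketch of the "reject the whole thing" approach, under some loud assumptions: the allowlist (p, em, strong, br) and the sub name input_is_acceptable are made up for illustration, and only bare tags with no attributes are accepted. Any unknown tag, or any stray angle bracket, rejects the entire input.

```perl
use strict;
use warnings;

# Hypothetical allowlist; replace with whatever your site really permits.
my %ALLOWED_TAG = map { $_ => 1 } qw(p em strong br);

# Returns 1 if the input may be accepted, 0 if it must be rejected outright.
# Assumption of this sketch: only bare tags such as <p> or </em> are legal,
# so a tag carrying attributes counts as a violation too.
sub input_is_acceptable {
    my ($text) = @_;

    # Reject on sight of the first bare tag that is not on the allowlist.
    while ($text =~ m{<(/?)([a-zA-Z][a-zA-Z0-9]*)>}g) {
        return 0 unless $ALLOWED_TAG{ lc $2 };
    }

    # Strip the bare tags; any angle bracket left over was not a bare tag.
    (my $rest = $text) =~ s{</?[a-zA-Z][a-zA-Z0-9]*>}{}g;
    return 0 if $rest =~ /[<>]/;

    return 1;
}
```

    A caller would show the "no dice" page whenever this returns 0, rather than trying to repair the input.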
    AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.
Re (tilly) 4: Opinions needed on CGI security
by tilly (Archbishop) on Feb 14, 2001 at 17:17 UTC
    The original poster was apparently trying to make the result safe to display. If that is your goal, then it did not succeed. Conversely if that is not the goal then it should have done nothing.

    Also it is possible to do this safely. For a try at this from me, take a look at Functional take 2.

Reroh Rorge: Opinions needed on CGI security
by baku (Scribe) on Feb 14, 2001 at 19:33 UTC

    There are a few ways to get almost all known HTML/JS evils out of the way...

    The simplest, and a very effective one, is to simply HTML-escape (entity-encode) everything that comes in, like the PerlMonks.Com <code> tag does. The following JavaScript is harmless: <script>alert("I am malevolent");</script> because it has turned into &lt;script&gt;... before your browser sees it.
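
    As a rough sketch of that escaping step (the sub name escape_html is just for illustration; in real code CGI.pm's escapeHTML or HTML::Entities' encode_entities would do the same job):

```perl
use strict;
use warnings;

# The five characters that matter in HTML text and attribute values.
my %ENTITY = (
    '&' => '&amp;',
    '<' => '&lt;',
    '>' => '&gt;',
    '"' => '&quot;',
    "'" => '&#39;',
);

# Return a copy of the text with every special character entity-encoded.
sub escape_html {
    my ($text) = @_;
    $text =~ s/([&<>"'])/$ENTITY{$1}/g;
    return $text;
}

print escape_html('<script>alert("I am malevolent");</script>'), "\n";
# prints: &lt;script&gt;alert(&quot;I am malevolent&quot;);&lt;/script&gt;
```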

    If you like certain HTML constructs, allow only them, like the <p> and <em> tags I'm using in this post (but not the <form> tag here: <form><input type="text" size="2"></form>).
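
    One way to sketch "allow only them" is to escape everything first and then switch the chosen tags back on. The tag list and the sub name escape_then_allow are assumptions for the example, and only bare tags (no attributes) are re-enabled, so nothing like onclick can sneak through.

```perl
use strict;
use warnings;

# Hypothetical allowlist of bare tags to re-enable after escaping.
my @ALLOWED = qw(p em);

my %ENTITY = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;');

sub escape_then_allow {
    my ($text) = @_;

    # Step 1: neutralise every special character.
    $text =~ s/([&<>"])/$ENTITY{$1}/g;

    # Step 2: turn escaped bare tags from the allowlist back into real tags,
    # normalising the tag name to lower case.
    my $names = join '|', map { quotemeta } @ALLOWED;
    $text =~ s{&lt;(/?)($names)&gt;}{<$1\L$2\E>}gi;

    return $text;
}

print escape_then_allow('<p>Hi <EM>there</EM></p><form>'), "\n";
# prints: <p>Hi <em>there</em></p>&lt;form&gt;
```

    Because the ampersand is escaped in step 1, a user who types a literal &lt;p&gt; gets exactly that text back rather than a live tag.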

    For better safety, as well as flexibility in presentation (HTML, WML, PDF, &c.), using an HTML -> internal form -> presentation form sequence might be desirable; e.g. using an XML dialect with no scripting, &c. internally.

    The only "badness" I know of which can't be readily filtered out this way is a hyperlink containing potentially malicious content, e.g. a link to a site that does evil things, or (but I don't think any current browsers are troubled by this) a buffer overrun in the URL itself or sommat.

    But, I'm sure someone will think of something interesting that can be done with <p> in IE 6, and we'll all be back to the drawing board :-)

Re: Re: Re (tilly) 2: Opinions needed on CGI security
by MeowChow (Vicar) on Feb 14, 2001 at 09:37 UTC
    There are certainly ways of filtering out all possibly evil HTML and Javascript, but they may be too restrictive for your application. For instance, you could launder CGI data through /^([a-zA-Z0-9_&;\s]*)$/ (anchored, so that the whole value has to match), which would disallow all HTML except for entities, but this would be much too restrictive for a site like PerlMonks where we need to be able to post code.
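
    Roughly, laundering through such a character class could look like this. The sub name launder is made up for the example, and the pattern is anchored with \A and \z so that the whole value has to match, not just a prefix of it.

```perl
use strict;
use warnings;

# Return the value if it consists only of permitted characters,
# or undef if anything else appears anywhere in it.
sub launder {
    my ($value) = @_;
    return unless defined $value;
    if ($value =~ /\A([a-zA-Z0-9_&;\s]*)\z/) {
        return $1;    # the captured copy is what taint mode treats as clean
    }
    return;           # reject; the caller decides how to report it
}

for my $input ('hello &amp; goodbye', '<script>evil</script>') {
    my $clean = launder($input);
    print defined $clean ? "accepted: $clean\n" : "rejected\n";
}
```

    The first input is accepted unchanged and the second is rejected because of the angle brackets and the slash.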

    Constructing a character class that filters out bad stuff is trivial. On the other hand, constructing a hack-proof set of regexen that permit specific combinations of characters while disallowing others (as in allow <a> but disallow <script> while allowing '<' and '>' if inside a code block) is far from easy.

    Everything's implementation of the latter is something you might want to take a look at.

       MeowChow                                   
                   s aamecha.s a..a\u$&owag.print
Re:(tilly) 2: Opinions needed on CGI security
by Gryphaan (Beadle) on Feb 14, 2001 at 17:42 UTC
    Thank you all for the comments so far.

    The program is a message board, so the only time the user data is used directly is when a page is generated.
    I was trying to use the method of specifying what I will allow, as mentioned by merlyn. Of all the potential input fields, there are 4 where I had to specify what I won't allow instead of what I will allow. This is because these fields can contain HTML. Since there are a lot of acceptable tags, I thought it prudent to specify the ones I don't want.

    I've read all I can find on security with CGIs and never found much that directly related to my program, but after seeing all the different methods used to attack a program I thought I'd better do some basic filtering.
    This problem grew because I don't spend my time trying to break other people's code, so I am probably unaware of common "hack" attempts.

    I know my code leaves it possible to have an unbalanced tag like the <table> tag, and thus the generated page may not display, but I haven't found any method that will match opening and closing tags.
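
    For what it's worth, the usual idea for matching opening and closing tags is a stack: push each opening tag, and require each closing tag to match the one on top. Here is a rough sketch of that idea only. The sub name and the list of tags that never close are assumptions, the check is stricter than real HTML (which lets some closing tags be omitted), and a regex scan like this will be fooled by a ">" inside an attribute value or a comment, where a real parser such as HTML::Parser would not be.

```perl
use strict;
use warnings;

# Tags that never take a closing tag (a partial, illustrative list).
my %VOID = map { $_ => 1 } qw(br hr img input);

# Returns 1 if every opening tag is closed in the right order, else 0.
sub tags_are_balanced {
    my ($html) = @_;
    my @open;    # stack of currently open tag names

    while ($html =~ m{<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>}g) {
        my ($closing, $name) = ($1, lc $2);
        next if $VOID{$name};

        if ($closing) {
            # A closing tag must match the most recently opened tag.
            return 0 unless @open && $open[-1] eq $name;
            pop @open;
        }
        else {
            push @open, $name;
        }
    }

    # Anything still on the stack was never closed.
    return @open ? 0 : 1;
}
```

    With this, an input like <table><tr><td>x (never closed) or <b><i>x</b></i> (closed in the wrong order) comes back as 0 and can be rejected before it breaks the page.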

    In the hopes that I'm not becoming completely paranoid, is there any standard filtering that I'm not using to minimize vulnerability?

    Thanks for all the advice, you guys will make a programmer out of me yet!

    -- Brian