in reply to newb: Best way to protect CGI from non-form invocation?

A quick and dirty trick is to add a text field (not a hidden field) named "subject" to your form. Hide this field from your users using CSS (input[name="subject"] { display: none; }). Most spam bots ignore the CSS and will fill in that field anyway. If that field is set, assume the form was submitted by a bot.

This trick can be used in conjunction with other methods for defense in depth.
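
A minimal sketch of the server-side half of this trick, written in Python for illustration rather than as anyone's actual CGI code. It assumes the form arrives as a URL-encoded body; the function name `looks_like_bot` is made up for this example, and only the field name "subject" comes from the post above.

```python
from urllib.parse import parse_qs

# Name of the decoy field that real users never see (hidden via CSS).
HONEYPOT_FIELD = "subject"


def looks_like_bot(form_body: str) -> bool:
    """Return True when the decoy field came back with any non-blank value."""
    fields = parse_qs(form_body, keep_blank_values=True)
    values = fields.get(HONEYPOT_FIELD, [])
    return any(value.strip() for value in values)
```

A submission with the field missing or blank passes; anything else is treated as suspect and can be dropped or handed to a second check.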

Re^2: newb: Best way to protect CGI from non-form invocation?
by chromatic (Archbishop) on Feb 05, 2007 at 22:10 UTC
    If that field is set, assume the form was submitted by a bot.

    ... or a real user whose CSS settings differ from your expectations.

      That's no biggie. You can include a warning on the form that would normally be hidden by the same mechanism that hides the input field.

      Title: [_________________________]
      Text:  [_________________________]
             [_________________________]
             [_________________________]
             [_________________________]
             [_________________________]
      LEAVE EMPTY!! [_] <- "subject" field. Normally hidden by CSS. Only non-CSS clients and overriding CSS clients will see.

      And when you receive a form with the field set, you could republish the form (pre-populated) to the client, asking him to resubmit it with the field empty.
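
A rough sketch of that "republish the form pre-populated" step, in Python for illustration. The field names `title` and `text` are guesses based on the mock-up above, and `render_retry_form` is a hypothetical helper, not part of any framework. The user's values are escaped before being echoed back, so a bot cannot use the retry page to inject markup.

```python
from html import escape


def render_retry_form(title: str, text: str) -> str:
    """Rebuild the form with the user's input kept and the decoy field cleared."""
    return (
        '<p>Please leave the "subject" field empty and submit again.</p>\n'
        '<form method="post">\n'
        f'<input type="text" name="title" value="{escape(title, quote=True)}">\n'
        f'<textarea name="text">{escape(text)}</textarea>\n'
        '<input type="text" name="subject" value="">\n'
        '<input type="submit">\n'
        '</form>'
    )
```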

      Thanks to those who have SO quickly replied! :)

      Due to old age and vision of my users, the captcha method is pretty much out...and restricted posting is also undesired, so I'm kinda limited to something behind the scenes...

      Assuming the spam is via a bot, how exactly does it find my form on site? And the data I'm getting is much longer than the field size limits on web page, so they either are using their own variant of my page (which I'd need to try and block) or what? If they were humans typing in spam on my site, it couldn't be as lengthy as what I'm seeing... or could it?
        Assuming the spam is via a bot, how exactly does it find my form on site?

        It spiders your site via HTTP and parses the HTML returned, looking for suspicious-looking form tags.

        And the data I'm getting is much longer than the field size limits on web page, so they either are using their own variant of my page (which I'd need to try and block) or what?

        All a form tag implies is that connecting via HTTP to the URI in the action attribute produces some action, and that it may or may not do anything with the form parameters submitted. If you can construct an HTTP request by yourself, you don't need the form.

        That's how forum spammers and web services work.
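
To make that concrete, here is a small Python sketch that builds (but does not send) the kind of request a bot would make. The URL and the field names are placeholders invented for this example. Nothing in the request depends on the HTML form, so a size limit set in the page never applies.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_post(action_url: str, fields: dict) -> Request:
    """Build a form-style POST by hand; no browser or HTML form involved."""
    body = urlencode(fields).encode("ascii")
    return Request(
        action_url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Hypothetical target; the "text" value is far longer than any maxlength
# attribute on the page would allow a browser user to type.
req = build_post(
    "http://example.com/cgi-bin/guestbook.pl",
    {"title": "hello", "text": "x" * 5000},
)
```

This is why any limit that matters has to be enforced again in the script that receives the request.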

        You could use length or some other pattern matching to catch spam. I would still suggest a captcha, though: if your users are vision impaired, make it really big. One of the best combos is to have a captcha but allow registered users to bypass it by logging in. Then your normal users aren't bothered and you keep the spammers out. I've used this very successfully in the past. Pair it up with an IP-based time limit and you'll keep 99% of unwanted spam out without bothering your users too much. PS: my captcha used actual words instead of random text to make it easier on users.

        Like with any security measure, the goal is to balance the strength of your security, your users' needs, and the benefits of the security you are adding. If you only get the normal bots that don't try too hard, then measures like this are very effective. If you think you are dealing with an individual determined to spam your site, then you might need a very different set of security measures.
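
A minimal Python sketch of the length check combined with an IP-based time limit. Both limits are made-up numbers, and `accept_post` is a hypothetical name. It also assumes a long-running process: the in-memory dict would not survive between plain CGI invocations, so a real script would need a file or database there instead.

```python
import time

MAX_FIELD_LENGTH = 2000         # assumed limit; match whatever the form allows
MIN_SECONDS_BETWEEN_POSTS = 60  # assumed cooldown per IP address

# Maps IP address -> timestamp of the last accepted post.
_last_post = {}


def accept_post(ip, text, now=None):
    """Reject over-long text and posts that arrive too soon from the same IP."""
    now = time.time() if now is None else now
    if len(text) > MAX_FIELD_LENGTH:
        return False
    last = _last_post.get(ip)
    if last is not None and now - last < MIN_SECONDS_BETWEEN_POSTS:
        return False
    _last_post[ip] = now
    return True
```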


        ___________
        Eric Hodges