Re^2: newb: Best way to protect CGI from non-form invocation?

(most of these are less effective at stopping a persistent abuser, but shouldn't stop any valid postings)

robots.txt file to stop the innocent search engines from posting
one-time keys in hidden inputs to track multiple postings to the form without a refresh
timestamps in a hidden input to track the length of the delay between form request and submission (for some forms, this isn't appropriate, but you can refill the form and ask them to resubmit and/or do something to confirm they aren't a bot)
input validation to ensure that the form hasn't been bypassed (eg, make sure select values are options that were on the form)
user-agent filtering as there have in the past been signatures of known misbehaving bots, and you might be able to identify a single abusive system/signature
rate limiting on all submissions to your system, rather than just a random per-submission delay. (so the more submissions to the site, the longer the delays introduced ... normally to slow down ballot stuffing so that admins can deal with it)
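
Of the items above, the input-validation check is the easiest to show in a few lines. This is only a rough sketch: the option list and the name valid_choice are made up for illustration, and in practice the list would be the same one used to build the select on the form.

```perl
use strict;
use warnings;

# Hypothetical option list: the same values used to build the <select>.
my %allowed_colour = map { $_ => 1 } qw(red green blue);

# Returns 1 only if the submitted value is one the form actually offered.
sub valid_choice {
    my ($value) = @_;
    return (defined($value) && exists $allowed_colour{$value}) ? 1 : 0;
}
```

A value that was never on the form (or a missing one) means the form was bypassed or tampered with, so the submission can be rejected before any other processing.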

Oh, and for the original poster -- there are plenty of CAPTCHAs that don't discriminate against the visually impaired, but they may cause problems for some other subset of users. Some simple ones are math problems (arithmetic, not calculus) or 'spot the member that's different' where alt text can work (eg, 8 bird species and a dog breed). I've even seen 'write 2 in the box'. Of course, CAPTCHAs don't work. See "If CAPTCHA isn't the answer. What is?" for more details.

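Oh -- and a timestamp hashed against the IP address (together with a secret that only the server knows, so the value can't simply be recomputed by the client) makes a fairly effective combined one-time key and timestamp.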
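
A minimal sketch of that combined token, assuming a keyed hash (HMAC-SHA256 from the core Digest::SHA module) stands in for "hashed". The names make_token and check_token, the secret, and the one-hour limit are all hypothetical. In a CGI the IP would come from $ENV{REMOTE_ADDR} and the time from time().

```perl
use strict;
use warnings;
use Digest::SHA qw(hmac_sha256_hex);

# Assumption: a secret known only to the server.
my $SECRET  = 'change-me';
my $MAX_AGE = 60 * 60;    # hypothetical limit: accept forms up to an hour old

# Build the value for the hidden input: "timestamp:hash".
sub make_token {
    my ($ip, $now) = @_;
    return $now . ':' . hmac_sha256_hex("$now|$ip", $SECRET);
}

# Accept only if the hash matches this IP and the stamp is recent enough.
sub check_token {
    my ($token, $ip, $now) = @_;
    my ($stamp, $mac) = ($token // '') =~ /\A(\d+):([0-9a-f]{64})\z/
        or return 0;
    return 0 if $mac ne hmac_sha256_hex("$stamp|$ip", $SECRET);
    return 0 if $stamp > $now || $now - $stamp > $MAX_AGE;
    return 1;
}
```

Note that nothing is stored on the server here, so the same token can be replayed from the same IP until it expires. To make it truly one-time you would still have to record which tokens have been used.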

Re^3: newb: Best way to protect CGI from non-form invocation? by radiantmatrix (Parson) on Feb 08, 2007 at 15:51 UTC
I would strongly recommend against freely mixing languages (e.g. Perl and PHP) in a single application. If you've already got a Perl application, use it to generate the time stamps and insert them in the output (HTML) that is your form.

Some information that may be of interest in this matter:

HTML::Template is an easy way to use template files for HTML; in this case, you could have a template variable wherever you want the timestamp/etc. to appear in your output.
The manual pages for time, sprintf, and the POSIX module are probably useful for dealing with times and conversion. Also, a CPAN search for DateTime is informative for any complicated date math. I would keep complexity down until and unless you need it (that's always true, I think).
A refresher on CGI::Simple is a good idea as well.
Read up on the W3C's WWW Security FAQ.
There's a section of CGI Programming with Perl titled Security that should be helpful.

<–radiant.matrix–>
Ramblings and references
The Code that can be seen is not the true Code
I haven't found a problem yet that can't be solved by a well-placed trebuchet
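
For what it's worth, here is roughly what generating the timestamp in Perl with time and sprintf could look like. It is a sketch only: the field name "issued" and both function names are invented for the example, and a bare timestamp like this can be edited by the client unless it is also signed or tracked on the server.

```perl
use strict;
use warnings;

# Emit a hidden input carrying the server's time (epoch seconds)
# at the moment the form was generated.
sub timestamp_field {
    my ($now) = @_;
    return sprintf '<input type="hidden" name="issued" value="%d">', $now;
}

# Age in seconds of a submitted form, or undef if the field is
# missing or is not a plain number.
sub form_age {
    my ($issued, $now) = @_;
    return undef unless defined($issued) && $issued =~ /\A\d+\z/;
    return $now - $issued;
}
```

When generating the form you would call timestamp_field(time), and when processing it, form_age($submitted_value, time), then decide what counts as too old (or too quick) for that particular form.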
Re^4: newb: Best way to protect CGI from non-form invocation? by JCHallgren (Sexton) on Feb 09, 2007 at 03:23 UTC
At this point, I'm going to have to add some PHP code to the site anyway, to try to keep certain pages from being viewed (by valid users) in a sequence other than the one I want. This may have the side benefit of helping to block the form page from being invoked by invalid users. And the only Perl currently in use is two (back-end) programs that process the two input forms. The rest of the site is just plain HTML (OK, plus a bit of JavaScript for some cookie handling) for display purposes, so calling it an application may be a bit of overkill!
Re^3: newb: Best way to protect CGI from non-form invocation? by JCHallgren (Sexton) on Feb 08, 2007 at 04:10 UTC
I DO appreciate your lengthy reply... but I have some follow-up questions:
1) Any suggested coding to implement 'timestamps' as you described? I'm not that sure how one would generate the timestamp... PHP maybe? (I'm starting to look at how to code PHP also.) I presume that almost no delay between form send and reply would indicate a non-human, as nobody would type THAT quickly, right?
2) Same question about 'one-time keys', but also... I didn't fully follow how this would work. Could you explain just a bit more?
Re^4: newb: Best way to protect CGI from non-form invocation? by jhourcle (Prior) on Feb 09, 2007 at 20:40 UTC
Okay, an explanation (sorry for the delay).

I use Perl to generate my forms. (By default, they generate empty forms, but when I see an input error, I generate the partially completed form, marking the inputs that were in error.) Therefore, inserting a timestamp is easy -- you just generate it and slip it into a hidden form input (I guess you could use cookies, too). It's possible to have it generated with JavaScript, but if it's client side and their clock is off, you'll run into problems. You could use 'AJAX' or whatever they want to call it to make a call back to your system for the time, but if they don't have JavaScript they'll still have problems.

For the timestamps, I hadn't thought about the issue of them submitting too quickly -- I was looking at them submitting too slowly (eg, hours/days apart, which is typical for some web spiders if you have a large site), or someone who crawls the site once for forms, then comes back later to run a job against it.

...

As for the one-time keys -- you generate a random number (or not random, but something unique), and you keep track of what keys have been issued. When the form is submitted, you remove its key from the list you're tracking. If a form comes in with a key that you're not tracking, you reject it. This requires a little more overhead. (The types of forms that I'm protecting like this typically use a database, so I just maintain a table with the keys issued; when a key gets issued or checked, I do some garbage collection and delete keys that are too old, eg, `delete from keys where issuetime < sysdate-1`.)

This doesn't work well for situations where it'd be normal for the person to hit back, change a value, and then submit again (eg, search engines). You probably wouldn't be trying this sort of protection on such sites, but you could set the appropriate caching headers to try to force them to refresh when they go back.
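
The one-time key bookkeeping described above might be sketched like this. To keep the example self-contained, an in-memory hash stands in for the database table (so it would not survive between separate CGI requests as written). The function names are hypothetical, and the one-day cutoff assumes sysdate-1 means "one day ago", as it does in Oracle.

```perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

my %issued;                      # key => issue time; stand-in for the table
my $MAX_AGE = 24 * 60 * 60;      # assumed one-day cutoff

# Garbage collection: drop keys older than the cutoff.
sub expire_keys {
    my ($now) = @_;
    delete @issued{ grep { $issued{$_} < $now - $MAX_AGE } keys %issued };
    return;
}

# Issue a fresh key to put in the form's hidden input.
# rand() is not cryptographically strong; it only keeps the keys unique-ish here.
sub issue_key {
    my ($now) = @_;
    expire_keys($now);
    my $key = sha256_hex(join '|', $now, $$, rand(), scalar(keys %issued));
    $issued{$key} = $now;
    return $key;
}

# Returns 1 exactly once per issued key; unknown, reused, or expired keys get 0.
sub redeem_key {
    my ($key, $now) = @_;
    expire_keys($now);
    return 0 unless defined $key;
    my $when = delete $issued{$key};
    return defined($when) ? 1 : 0;
}
```

With a real database the same three steps become an insert when the form is generated, a delete when it is submitted (rejecting the submission if no row was deleted), and the cleanup query run alongside either one.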
Re^3: newb: Best way to protect CGI from non-form invocation? by radiantmatrix (Parson) on Feb 08, 2007 at 15:43 UTC
I participated in one of those threads, and as I pointed out in Re: If CAPTCHA isn't the answer. What is?, a CAPTCHA isn't what most people seem to think it is. There are also a few additional suggestions there for proving that someone is a human. I think the most worthy point of the articles you referenced is that nothing is a perfect solution. SPAM is a fact of life, and anything you do will (a) be a tradeoff and (b) fail to stop all attacks.