Introduction
Yes, I know the title is bad code. I wanted something memorable so we can point back to this node again and again.
We've seen it again and again. Everybody and their dog, at one time or another, seems to have toyed with an alternative to CGI.pm. If you think it's too bloated, try CGI::Lite, but don't go rolling your own. This node, and (hopefully) the resulting thread, is just something convenient to toss to newbies who aren't aware of the issues involved.
Common Problems with Alternatives
Here are some common reasons not to use alternatives to CGI.pm:
- Your version probably won't allow for file uploads. For a good example of why, please check out japhy's online CGI course (particularly chapter 2). [1]
- Did you know that color=blue&color=red is a valid query string? Most alternatives don't properly handle multiple values for one parameter. Those that do typically use a null byte (ASCII zero) to deal with this. This leaves the potential for opening up a nasty security hole. [2]
- Typically, these alternatives do not allow for any delimiter besides the ampersand. Semicolons are sometimes used to delimit name/value pairs, but you'd never know it from examining most homemade alternatives.
- When was the last time you saw a hand-rolled version verify that the length of data read from STDIN matched $ENV{ CONTENT_LENGTH }? If the browser screws up, you could have corrupt data, but if you don't verify the content length, you'll never know. This, being an intermittent bug, is incredibly difficult to debug.
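To make a couple of those bullets concrete, here is a minimal sketch using only core Perl. Both naive_parse and read_body are hypothetical names invented for this illustration: the first mimics a typical hand-rolled parser so you can watch it lose data, and the second shows roughly what checking the bytes read against the promised content length amounts to. Neither is CGI.pm's actual code.

```perl
use strict;
use warnings;

# Hypothetical hand-rolled parser of the kind described above (not CGI.pm).
sub naive_parse {
    my ($query) = @_;
    my %form;
    for my $pair (split /&/, $query) {            # only knows about '&'
        my ($name, $value) = split /=/, $pair, 2;
        $form{$name} = $value;                    # a repeated name overwrites the earlier value
    }
    return %form;
}

# Hypothetical helper: read exactly $expected bytes from $fh,
# and die if the stream ends before that many bytes arrive.
sub read_body {
    my ($fh, $expected) = @_;
    my $body = '';
    while (length($body) < $expected) {
        my $n = read($fh, $body, $expected - length($body), length($body));
        die "read failed: $!" unless defined $n;
        last if $n == 0;                          # end of input
    }
    die sprintf("expected %d bytes, got %d\n", $expected, length $body)
        if length($body) != $expected;
    return $body;
}

# Two values for "color", plus a semicolon-delimited pair.
my %form = naive_parse('color=blue&color=red;size=large');
print "color: $form{color}\n";    # "blue" is gone and ";size=large" is glued on
print "size present? ", (exists $form{size} ? 'yes' : 'no'), "\n";

# Simulate a browser that promised 20 bytes but sent only 10.
my $sent = 'name=Ovid;';
open my $fh, '<', \$sent or die "cannot open in-memory handle: $!";
my $ok    = eval { read_body($fh, 20); 1 };
my $error = $ok ? '' : $@;
print "short read detected: $error" if $error;
```

The point is not that you should polish these helpers, but that every one of these cases is something a homemade parser has to get right by itself.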
Related Problems
- They don't use taint. Have you ever seen any Perl script that used both its own CGI parser and the -T switch? I haven't. I'm not saying it's not out there, but that's the way things seem to work out.
- They don't use warnings or strict. Okay, this one's not a hard and fast rule, but it's more likely to /cr[oa]p/ up in this situation, IMHO.
- I routinely see the following in a CGI processing routine:
    $value =~ s/<!--(.|\n)*-->//g;
  Not only is that a terrible regex (alternation on single characters, dot star, breaks with multiple comments or SSIs), it has no place in a form processing routine. Two complaints about it:
  - Modular code should do one thing and do it well. We're processing CGI data here, not trying to strip out HTML comments or server side includes. Keep it simple, damn it! [3]
  - If you see this, it means the person has probably cut and pasted the code from someone else and doesn't really understand it. This is a Bad Thing. I don't claim to understand the deep VooDoo of Parse::RecDescent, but at least I can get plenty of others who can attest to its safety and solidity. Not so for Joe Blow's "Steal Me" script archive.
- This is just some stuff off the top of my head. Please add to the list!
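If the complaint about the comment-stripping regex seems abstract, here is a small sketch of what it does to input containing more than one comment. The sample string is made up for the illustration. The lazy variant is shown only for contrast and is not a recommendation to strip HTML inside a form parser.

```perl
use strict;
use warnings;

# Hypothetical input: an HTML comment, some real text, then an SSI directive.
my $value = 'before <!-- one --> keep me <!--#exec cmd="ls" --> after';

# The regex quoted in the list above: greedy, so it runs from the first
# "<!--" all the way to the last "-->".
(my $greedy = $value) =~ s/<!--(.|\n)*-->//g;
print "[$greedy]\n";    # prints "[before  after]"; "keep me" was eaten

# A lazy match with /s stops at the nearest "-->" instead.
(my $lazy = $value) =~ s/<!--.*?-->//gs;
print "[$lazy]\n";      # prints "[before  keep me  after]"
```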
This person is using CGI.pm but still (incorrectly) hand-parsing the data:
    use CGI qw/:standard/;
    read(STDIN, $formdata, $ENV{'CONTENT_LENGTH'});
    @pairs = split(/\&/, $formdata);
    foreach $pair (@pairs){
        ($name, $value) = split(/=/, $pair);
        $value =~ tr/+/ /;
        $value =~ s/%0D%0A/\n/g;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $FORM{$name} = $value;
    }
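For comparison, here is a rough sketch of how the same %FORM hash could be filled by letting CGI.pm do the parsing. It assumes a reasonably recent CGI.pm that provides multi_param (older releases only have param, which returns the same list in list context). The query string passed to the constructor is made-up test data; in a real script you would call CGI->new with no arguments and let it read the request.

```perl
use strict;
use warnings;
use CGI;

# Hypothetical request data for the demo. With no argument, CGI->new
# reads the actual GET or POST request instead.
my $q = CGI->new('name=Ovid+the+Monk&sports=golf&sports=chess');

my %FORM;
for my $name ($q->param) {
    # Every value for this name, already URL-decoded by CGI.pm.
    $FORM{$name} = [ $q->multi_param($name) ];
}

print "name:   $FORM{name}[0]\n";
print "sports: @{ $FORM{sports} }\n";
```

Storing an array reference per name is just one design choice for the sketch; it keeps repeated values separate without joining them into a single string.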
Benefits of CGI.pm
No sense in showing you the stick if I don't bother with the carrot.
- This is some of the most widely tested code you're going to find. Period. Unlike the stuff put out by the Evil Empire, you can go in there and fix it if you don't like it.
- It's supported. Don't believe the pointy-haired boss "you get what you pay for" argument (I once had a bank manager say that to me about Linux. I should have inquired about his free checking). They also probably say things like "think outside the box", but obviously aren't able to do so. And when was the last time you could fix "supported code" yourself?
- It forces code reuse. Despite all of the talk about code reuse, few people/companies actually bother with this. Using CGI.pm consistently means that upgrading the module in one spot doesn't mean searching through 500 programs for all of the little tweaks you need. I know that someone hand-rolling code might stuff it in a module, but after they realize the limitations, the odds are their interfaces will break when they're forced to add the features that CGI.pm already has. (My company comes to mind as a perfect example.)
- You can debug your scripts from the command line. If you've done CGI programming before, you know that you have some unique debugging obstacles. Debugging from the command line solves many headaches!
- You can use GET or POST methods without a single change to your script. It's almost completely transparent.
- CGI.pm has become the standard for CGI programming with Perl. If you need help, there are plenty who can and will help you. If you write your own, you'll need help because you wrote a broken alternative. Ask for help with that and you'd better make sure to bring your asbestos undies because you're going to bask in the flames, baby.
- It's ridiculously easy to use. Do you have 8 checkboxes named "sports" and you want their values in an array?
    use CGI qw/:standard/;
    my @sports = param( 'sports' );
  My apologies, but if you can't figure that out, maybe you should check out VB.
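Here is a small sketch of the "no change between GET and POST" point, using the object-oriented interface. The environment variables are set by hand only to simulate what a web server would provide, and the query string is invented for the demo. It again assumes a CGI.pm recent enough to have multi_param; as far as I know, recent releases warn when plain param is called in list context and suggest multi_param instead.

```perl
use strict;
use warnings;
use CGI;

# Simulate a GET request; normally the web server sets these.
$ENV{REQUEST_METHOD} = 'GET';
$ENV{QUERY_STRING}   = 'sports=golf;sports=chess&name=Ovid';

my $q      = CGI->new;                     # same call for GET or POST
my @sports = $q->multi_param('sports');    # every checked box
my $name   = $q->param('name');            # scalar context: a single value

print "$name plays: @sports\n";            # prints "Ovid plays: golf chess"
```

Note the semicolon in the query string: it is treated as a pair delimiter just like the ampersand. CGI.pm's documentation also describes passing name=value pairs as command-line arguments when a script is run outside a web server, which is what makes the command-line debugging mentioned above possible.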
Cheers,
Ovid
Join the Perlmonks Setiathome Group or just click on the link and check out our stats.
Footnotes
1. Yeah, I know that I have an online CGI course, also. The course that japhy is preparing seems to be much more of a rigorous analysis than mine. Mine is targeted at a different audience. (read: japhy's Perl is way better than mine so I pander to the masses :-).
2. Why should that be a security hole? If you only have one file with a given name, you won't be separating them with null bytes, right? Not necessarily. A wily cracker can simply add another parameter with the same name and your script will politely add a null byte for you. Of course, proper taint checking will stop this, but so will using CGI.pm.
3. I don't know who first started the annoying habit of trying to strip out SSIs in the parameter processing routine, but here's the potential benefit: let's say you let users sign up at your site and create a home page. You use CGI to capture their home page data and write it to an HTML file, but you don't want to allow people to run SSIs (a huge security hole, if you're configured wrong). This code will strip out SSIs, HTML comments, and everything between them if you have more than one. See Death to Dot Star! if you're unfamiliar with that issue.
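The null-byte scenario from the second footnote can be sketched roughly like this. The parser null_join_parse, the "page" field, and the templates/ path are all hypothetical, invented to show how a repeated parameter puts a null byte into data the script thought was a single value.

```perl
use strict;
use warnings;

# Hypothetical hand-rolled parser that joins repeated names with a null byte.
sub null_join_parse {
    my ($query) = @_;
    my %form;
    for my $pair (split /&/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $form{$name} = exists $form{$name} ? "$form{$name}\0$value" : $value;
    }
    return %form;
}

# The form has one "page" field, but the attacker sends it twice.
my %form = null_join_parse('page=secret.conf&page=ignored');

# The script means to force an .html suffix onto whatever was submitted.
my $path = "templates/$form{page}.html";

# Anything that treats the string as null-terminated (C style) sees only
# the part before the injected null byte.
my ($c_view) = split /\0/, $path;

(my $visible = $path) =~ s/\0/\\0/g;       # make the null byte printable
print "perl sees: $visible\n";
print "C sees:    $c_view\n";
```

Whether a given Perl will actually open a path with an embedded null byte depends on the version, so treat this as an illustration of how the byte gets there, not as a claim about what any particular open call will do.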