PerlMonks  

Introduction

Yes, I know the title is bad code. I wanted something memorable so we can point back to this node again and again.

We've seen it again and again. Everybody and their dog at one time or another seems to have toyed with an alternative to CGI.pm. If you think it's too bloated, try CGI::Lite, but don't go rolling your own. This node, and (hopefully) the resulting thread, is just something convenient to toss to newbies who aren't aware of the issues involved.

Common Problems with Alternatives

Here are some common reasons not to use alternatives to CGI.pm:
  • Your version probably won't allow for file uploads. For a good example of why, please check out japhy's online CGI course (particularly chapter 2). [1]
  • Did you know that color=blue&color=red is a valid query string? Most alternatives don't properly handle multiple values for one parameter. Those that do typically use a null byte (ASCII zero) to deal with this. This leaves the potential for opening up a nasty security hole. [2]
  • Typically, these alternatives do not allow for any delimiter besides the ampersand. Semicolons are sometimes used to delimit name/value pairs, but you'd never know it examining most homemade alternatives.
  • When was the last time you saw a hand-rolled version verify that the length of data read from STDIN matched $ENV{ CONTENT_LENGTH }? If the browser screws up, you could have corrupt data, but if you don't verify the content length, you'll never know. This, being an intermittent bug, is incredibly difficult to debug.
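The CONTENT_LENGTH check from the last bullet can be sketched in a few lines. This only illustrates the check itself and is not a replacement for CGI.pm; read_post_body is a hypothetical helper name, and an in-memory filehandle stands in for STDIN so the sketch runs outside a web server.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $expected bytes from $fh and die
# if the stream ends early, instead of silently using short data.
sub read_post_body {
    my ($fh, $expected) = @_;
    my $body = '';
    while (length($body) < $expected) {
        my $got = read($fh, $body, $expected - length($body), length($body));
        die "read failed: $!" unless defined $got;
        last if $got == 0;    # EOF before we got everything
    }
    die sprintf("expected %d bytes, got %d\n", $expected, length $body)
        if length($body) != $expected;
    return $body;
}

# Stand-ins for STDIN and $ENV{CONTENT_LENGTH} (assumptions for this sketch).
my $data = 'color=blue&color=red';
open my $ok_fh, '<', \$data or die $!;
print read_post_body($ok_fh, length $data), "\n";

# A truncated request: the client claimed 50 bytes but sent fewer.
open my $short_fh, '<', \$data or die $!;
my $result = eval { read_post_body($short_fh, 50) };
print "rejected: $@" unless defined $result;
```

A hand-rolled parser that skips this comparison will happily parse the short body as if nothing had gone wrong.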
Those are some of the biggies. The following is a list of complaints that, while not directly related to the "hand-rolled" problem, tend to crop up in the code of those who insist upon doing it themselves.

Related Problems

  • They don't use taint. Have you ever seen any Perl script that used both its own CGI parser and the -T switch? I haven't. I'm not saying it's not out there, but that's the way things seem to work out.
  • They don't use warnings or strict. Okay, this one's not a hard and fast rule, but it's more likely to /cr[oa]p/ up in this situation, IMHO.
  • I routinely see the following in a CGI processing routine:
    $value =~s/<!--(.|\n)*-->//g;
    Not only is that a terrible regex (alternation on single characters, dot star, breaks with multiple comments or SSIs), it has no place in a form processing routine. Two complaints about it:

    1. Modular code should do one thing and do it well. We're processing CGI data here, not trying to strip out HTML comments or server side includes. Keep it simple, damn it! [3]
    2. If you see this, it means the person has probably cut and pasted the code from someone else and doesn't really understand it. This is a Bad Thing. I don't claim to understand the deep VooDoo of Parse::RecDescent, but at least I can get plenty of others who can attest to its safety and solidity. Not so for Joe Blow's "Steal Me" script archive.
  • This is just some stuff off the top of my head. Please add to the list!
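To see why the comment-stripping regex above misbehaves with more than one comment, here is a small sketch. The sample string and the strip_comments_lazy name are made up for illustration, and the non-greedy version is still not a real HTML parser. If you need this at all, it belongs in its own routine, not in the parameter parser.

```perl
use strict;
use warnings;

my $value = "keep <!-- one --> this text <!-- two --> end";

# The regex quoted above: (.|\n)* is greedy, so it runs from the
# first "<!--" all the way to the last "-->".
(my $greedy = $value) =~ s/<!--(.|\n)*-->//g;

# Hypothetical separate routine with a non-greedy match; /s lets
# "." match newlines, so the alternation is not needed.
sub strip_comments_lazy {
    my ($text) = @_;
    $text =~ s/<!--.*?-->//gs;
    return $text;
}

print "greedy: [$greedy]\n";
print "lazy:   [", strip_comments_lazy($value), "]\n";
```

The greedy version throws away "this text" along with both comments, which is exactly the "breaks with multiple comments" complaint.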
If you want instant verification of this stuff, use Super Search and search for CONTENT_LENGTH in the text of articles. Not all are applicable, but there are some real doozies out there. Here's my favorite:
    use CGI qw/:standard/;
    read(STDIN, $formdata, $ENV{'CONTENT_LENGTH'});
    @pairs = split(/\&/, $formdata);
    foreach $pair (@pairs){
        ($name, $value) = split(/=/, $pair);
        $value =~ tr/+/ /;
        $value =~ s/%0D%0A/\n/g;
        $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
        $FORM{$name} = $value;
    }
This person is using CGI.pm but still (incorrectly) hand-parsing the data.
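For contrast, here is a sketch of the same job with CGI.pm doing the parsing. The query string passed to new is made-up sample data standing in for a real request, and multi_param assumes a reasonably recent CGI.pm (older releases return the same list from param in list context).

```perl
use strict;
use warnings;
use CGI;

# Sample data standing in for a real request (an assumption for this
# sketch); in a live script, CGI->new with no arguments reads the request.
my $q = CGI->new('name=Ovid;color=blue&color=red&note=a+b%21');

my %FORM;
for my $name ($q->param) {
    # Keep every value; decoding of "+" and %XX has already been done.
    $FORM{$name} = [ $q->multi_param($name) ];
}

for my $name (sort keys %FORM) {
    print "$name: ", join(', ', @{ $FORM{$name} }), "\n";
}
```

Note that the sample mixes semicolon and ampersand delimiters and repeats color, two of the cases the hand-parsed loop above gets wrong.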

Benefits of CGI.pm

No sense in showing you the stick if I don't bother with the carrot.

  • This is some of the most widely tested code you're going to find. Period. Unlike the stuff put out by the Evil Empire, you can go in there and fix it if you don't like it.
  • It's supported. Don't believe the pointy-haired boss "you get what you pay for" argument (I once had a bank manager say that to me about Linux. I should have inquired about his free checking). They also probably say things like "think outside the box", but obviously aren't able to do so. And when was the last time you could fix "supported code" yourself?
  • It forces code reuse. Despite all of the talk about code reuse, few people/companies actually bother with this. Using CGI.pm consistently means that upgrading the module in one spot doesn't mean searching through 500 programs for all of the little tweaks you need. I know that someone hand-rolling code might stuff it in a module, but after they realize the limitations, the odds are their interfaces will break when they're forced to add the features that CGI.pm already has. (My company comes to mind as a perfect example.)
  • You can debug your scripts from the command line. If you've done CGI programming before, you know that you have some unique debugging obstacles. Debugging from the command line solves many headaches!
  • You can use GET or POST methods without a single change to your script. It's almost completely transparent.
  • CGI.pm has become the standard for CGI programming with Perl. If you need help, there are plenty who can and will help you. If you write your own, you'll need help because you wrote a broken alternative. Ask for help with that and you'd better make sure to bring your asbestos undies because you're going to bask in the flames, baby.
  • It's ridiculously easy to use. Do you have 8 checkboxes named "sports" and you want their values in an array?
    use CGI qw/:standard/;
    my @sports = param( 'sports' );
    My apologies, but if you can't figure that out, maybe you should check out VB.
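The command-line debugging point can be tried with a short script like the sketch below. The show_params helper and the file name in the usage note are hypothetical, and multi_param assumes a reasonably recent CGI.pm.

```perl
use strict;
use warnings;
use CGI;

# Hypothetical helper: format every parameter with all of its values.
sub show_params {
    my ($q) = @_;
    my $out = '';
    for my $name ($q->param) {
        $out .= "$name = " . join(', ', $q->multi_param($name)) . "\n";
    }
    return $out;
}

# With no arguments, CGI->new takes its input from the request when
# running under a web server, or from name=value pairs given on the
# command line when the script is run by hand.
my $q = CGI->new;
print $q->header('text/plain'), show_params($q);
```

Saved as, say, sports.cgi, this can be run from a shell as `perl sports.cgi sports=golf sports=chess` and should list both values under sports. The same script behind a web server should handle GET and POST forms without changes.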

Cheers,
Ovid

Join the PerlMonks Setiathome Group or just click on the link and check out our stats.


Footnotes

  1. Yeah, I know that I have an online CGI course, also. The course that japhy is preparing seems to be much more of a rigorous analysis than mine. Mine is targeted at a different audience. (read: japhy's Perl is way better than mine so I pander to the masses :-).
  2. Why should that be a security hole? If you only have one file with a given name, you won't be separating them with null bytes, right? Not necessarily. A wily cracker can simply add another parameter with the same name and your script will politely add a null byte for you. Of course, proper taint checking will stop this, but so will using CGI.pm.
  3. I don't know who first started the annoying habit of trying to strip out SSIs in the parameter processing routine, but here's the potential benefit: let's say you let users sign up at your site and create a home page. You use CGI to capture their home page data and write it to an HTML file, but you don't want to allow people to run SSIs (a huge security hole, if you're configured wrong). This code will strip out SSIs, HTML comments, and everything between them if you have more than one. See Death to Dot Star! if you're unfamiliar with that issue.
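
Footnote 2 can be made concrete with a short sketch. The naive_parse routine below is a made-up example of the null-byte convention, not code from any particular script, and the file names are invented.

```perl
use strict;
use warnings;

# Made-up hand-rolled parser that joins repeated names with "\0".
# URL decoding is left out to keep the sketch short.
sub naive_parse {
    my ($query) = @_;
    my %form;
    for my $pair (split /[&;]/, $query) {
        my ($name, $value) = split /=/, $pair, 2;
        $value = '' unless defined $value;
        $form{$name} = exists $form{$name}
            ? "$form{$name}\0$value"
            : $value;
    }
    return %form;
}

# The form has one "file" field, but nothing stops a client from
# sending a second one.
my %form = naive_parse('file=report.txt&file=other.txt');

# The script now holds a value with an embedded null byte it never
# expected, for example as part of a file name.
print "embedded null byte\n" if $form{file} =~ /\0/;
print join(' | ', split /\0/, $form{file}), "\n";
```

Any code that later passes $form{file} along without checking it is working with data the author never planned for, which is where taint checking or CGI.pm earns its keep.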

In reply to use CGI or die; by Ovid
