Notice: my intention in this post is not to start a PHP versus Perl flame war. This post is about good coding practices being applicable to all languages, not about any particular language's weaknesses. Further, for those who wish to flame me about the "Perl problems" that I mention, this thread isn't the place. Heck, those "problems" could simply be personal prejudices. However, the point of this node is not whether or not Perl is perfect; it's about good coding practices.

The Problem

I've recently been learning PHP for a Web site that I need to maintain, and I discovered something curious about the language: you can't predeclare variables. In fact, anyone can create a global variable (with any data they want in it) in your code simply by inserting an appropriately named form element in the HTML document that has the data they want. There does not appear to be a PHP equivalent of 'use strict'.

My initial thought was to write a Perl script that validates my PHP code and warns me when I have misspelled a variable, used it only once, etc. It was so irritating to me that an interesting tool like PHP has such a glaring violation of good coding practice that I started thinking about this a bit differently. I've programmed in quite a few languages and noticed that all of them have problems. Here's a brief sample:

  • COBOL
      • All variables are global.
      • No references.
  • VBScript
      • Horrible problem with silently mangling dates.
      • Case-insensitive (maybe that's just a personal gripe).
  • PHP
      • Can't predeclare variables to catch typos.
      • Variable variables seem to be encouraged in the documentation.
  • Perl (didn't think I'd leave it out, did ya?)
      • Excessive use of globals built in to the language.
      • OO is kludgy and slow.
      • Perl's prototypes have some significant issues.

Many have seen that newer programmers often fail to use strict, warnings, taint checking, or many other good programming practices that are suggested to them. I'm here to say you're not only hurting yourself; you're hurting anyone who has to maintain your code. The interesting thing about these programming practices is that they are not Perl-specific. In fact, there are few, if any, languages where these programming practices don't apply.

Why good programming practices are good

Predeclared variables

Let's start off with 'use strict'. Use strict affects variables, references, and subroutines. For the sake of brevity, I'll just cover variables.

Let's face it, when you have a 2000 line program and buried in that program, somewhere, is a variable mis-named %quarterly_reciepts, it's not an easy issue to figure out. Finding a misspelled variable name is a snap when you predeclare variables, but if you don't, you may have no idea that your code is spitting out bad output because of a misspelling. You might spend time figuring out if you're reading from your database correctly or wondering if you have a file buffering problem. Why wonder whether or not you've misspelled a variable when you can trap that issue in a couple of seconds and potentially save many, many hours of debugging? I guarantee that programmers coming behind you may not thank you for using strict, but they will curse you if you don't.

Perl has 'use strict' to protect against undeclared variables. VBScript has 'Option Explicit'. Even venerable COBOL has 'Working-Storage' to deal with these issues. If this feature is optional in your language of choice, turn that option on!
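To make the typo scenario concrete, here is a minimal sketch. It compiles a small snippet through a string eval purely so the failure can be displayed without killing the script; in real code you would simply put 'use strict' at the top of the file. The correctly spelled %quarterly_receipts and the Q1/Q2 values are made up for illustration.

```perl
use strict;
use warnings;

# The snippet declares %quarterly_receipts but then assigns to the
# misspelled %quarterly_reciepts. Names and values are illustrative only.
my $snippet = <<'END';
use strict;
my %quarterly_receipts = ( Q1 => 100 );
$quarterly_reciepts{Q2} = 200;    # misspelled on purpose
1;
END

# String eval returns undef and sets $@ when the snippet fails to compile.
my $ok    = eval $snippet;
my $error = $@;

print $ok ? "compiled without complaint\n" : "strict caught it: $error";
```

Under strict, the snippet should refuse to compile with a 'Global symbol "%quarterly_reciepts" requires explicit package name' error. Without strict, Perl would quietly create a second hash and carry on.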

Global variables

So, you've written your first module. In fact, you've written an entire suite of modules that share data amongst themselves and the programs that use them. Knowing that laziness is a virtue (a false virtue, in this case), you decide to use global variables for some data that everything uses. Here are potential problems with this (some of these are general issues, others are Perl-specific):

  • Months later, when you or someone else comes back to maintain the code, the first question that gets asked is "where the heck is $main::incr set to 5?"
  • In your suite of programs, you have a little bug that munges that global variable and the rest of the code breaks. Hmm... wonder what changed it. Good luck finding out.
  • You want to port the code to mod_perl. Too bad. You no longer use the %main:: namespace.
  • 'use strict' doesn't catch problems with misspelled globals unless you declare them with "our" or "use vars". Many programmers don't understand how those work.
  • Later, a maintenance programmer who works on your code is going to have to try and remember what all of the globals are for. With lexically scoped variables, this is much easier to do.
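One way to sidestep most of these problems is to keep the value in a lexical variable and expose it through a couple of subroutines. This is only a sketch: the names increment and set_increment are hypothetical, and $incr stands in for the $main::incr of the first bullet.

```perl
use strict;
use warnings;

{
    # Visible only inside this block, so nothing else can munge it directly.
    my $incr = 5;

    # Read access for the rest of the program.
    sub increment { return $incr }

    # The single place where the value can change, with a sanity check.
    sub set_increment {
        my ($new) = @_;
        die "increment must be a positive integer\n"
            unless defined $new && $new =~ /\A[1-9]\d*\z/;
        $incr = $new;
        return $incr;
    }
}

print "increment is ", increment(), "\n";
set_increment(10);
print "increment is now ", increment(), "\n";
```

With this layout, "where is it set to 5?" has exactly one answer, and a stray assignment elsewhere becomes a compile-time error under strict instead of a silent change.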

Modular/orthogonal code

Each piece of code should do one thing and do it well. I think one of the most famous Perl examples of violating this principle is the following misguided attempt to parse form variables.

foreach $pair (@pairs) {
    ($key, $value) = split (/=/, $pair);
    # Convert plusses to spaces
    $key =~ tr/+/ /;
    # Convert Hex values to ASCII
    $key =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    $value =~ tr/+/ /;
    $value =~ s/%([a-fA-F0-9][a-fA-F0-9])/pack("C", hex($1))/eg;
    # Eliminiate SSI's
    $value =~s/<!--(.|\n)*-->//g;
    # If we already have a key with this name, allow for
    # multiple values!!!
    if ($formdata{$key}) {
        $formdata{$key} .= ", $value";
    }
    else {
        $formdata{$key} = $value;
    }
}

See the line that tries to eliminate server side includes ($value =~s/<!--(.|\n)*-->//g;)? Aside from the fact that it's a terribly written regular expression, it also will cut out a lot of HTML comments (in fact, it will pretty much destroy an HTML document if it has more than one comment in it). What happens when you want to include HTML? You have to rewrite this routine, which could cause problems if other code relies on it. A form-parsing routine should parse the form data, that's all. If you want to strip anything out of that data, do it elsewhere.

Code that doesn't have side effects is known as 'orthogonal' code. For example, if you step on the brakes in your car, you don't want it to veer to the left. If you turn on your headlights, you don't want that to automatically trigger your windshield wipers. If you are validating a username and password, don't go out and grab the CNN headlines in the same routine.
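As a rough illustration of keeping those jobs apart, the sketch below splits parsing and comment stripping into two routines. Both parse_query and strip_html_comments are hypothetical names, and the hand-rolled decoding is only there to keep the example self-contained; a maintained module is likely the better choice in production code.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a query string into key => [ values ].
# It parses and nothing else; no filtering happens here.
sub parse_query {
    my ($query) = @_;
    my %form;
    for my $pair ( split /[&;]/, $query ) {
        my ( $key, $value ) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;
        for ( $key, $value ) {
            tr/+/ /;                                # plus signs become spaces
            s/%([0-9a-fA-F]{2})/chr( hex $1 )/ge;   # %XX becomes a character
        }
        push @{ $form{$key} }, $value;              # keep repeated keys separate
    }
    return \%form;
}

# Hypothetical helper: a separate, optional step that callers choose to apply.
# The non-greedy match removes each comment individually.
sub strip_html_comments {
    my ($text) = @_;
    $text =~ s/<!--.*?-->//gs;
    return $text;
}

my $form = parse_query('name=J+Doe&tag=a&tag=b%21');
print "name: $form->{name}[0]\n";
print "tags: @{ $form->{tag} }\n";
```

Code that wants raw HTML calls parse_query alone; code that wants comments removed calls strip_html_comments on the values it cares about. Neither has to be rewritten when the other changes.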

Check your system calls

We've all seen it:

open DATA, $data;

If you failed to open the file, your code continues to silently run. If this is embedded in a large system, this could take a long time to track down. Sure, adding an 'or die "Cannot open $data: $!"' clause is more work, but the extra cost of fire insurance is a blessing when your house burns down.
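Here is roughly what a checked version can look like. Treat it as a sketch: read_lines is a hypothetical helper name, and the three-argument open with a lexical filehandle is a common modern substitute for the bareword DATA handle shown above.

```perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file, or die with a message
# that names the file and includes the system error text from $!.
sub read_lines {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open '$path' for reading: $!";
    my @lines = <$fh>;
    close $fh
        or die "Cannot close '$path': $!";
    return @lines;
}

# Demonstrate the failure path without aborting the script.
my @lines = eval { read_lines('/no/such/directory/data.txt') };
print "open failed as expected: $@" if $@;
```

The low-precedence "or" matters here: with "||" the test would bind to $path instead of to the result of open.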


Many newer programmers fail to realize that something is going to go wrong with their code. Maybe the user types a letter instead of the numbers you have on your menu choices. Maybe a function returns an array instead of a reference to one. Maybe, gasp, someone with malicious intent is trying to break your code (hopefully, they're your testing department).

Sometimes, you may think that validating your data is a waste of time. I remember one time that I was writing a program that would summarize commission data and the programmer who wrote the system that I was working on asked me why it was taking so long. I showed her my code and it had gobs of input validation. As it turns out, she had written a wrapper for this system which validated all data long before it got to me. In theory, I could have dispensed with my validation. However, what happens if the input data for the system changes and someone needs to rewrite that wrapper? We all know how easy it is to write buggy code and there's no guarantee that nice, clean data that enters my program today will be clean tomorrow. Remember, you're sleeping with every program that your program ever slept with (okay, that was a rotten analogy).

One of the beautiful things about strong data validation is that you control the error messages. Rather than having a program die a horrible death when it tries to divide by zero, you've already trapped that bad value and have a nice, useful message in the error log.
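A small sketch of what that can look like, using the menu and divide-by-zero cases mentioned above. The routine names validate_menu_choice and commission_rate, and the exact wording of the messages, are invented for this example.

```perl
use strict;
use warnings;

# Hypothetical validator for a numbered menu with choices 1..$max.
sub validate_menu_choice {
    my ( $input, $max ) = @_;
    die "No menu choice supplied\n" unless defined $input;
    die "Menu choice '$input' is not a whole number\n"
        unless $input =~ /\A\d+\z/;
    die "Menu choice $input is outside 1..$max\n"
        if $input < 1 || $input > $max;
    return $input + 0;
}

# Hypothetical calculation: reject a bad divisor up front so the error
# message is ours rather than Perl's "Illegal division by zero".
sub commission_rate {
    my ( $commission, $sales ) = @_;
    die "Sales total must be a number greater than zero\n"
        unless defined $sales
        && $sales =~ /\A\d+(?:\.\d+)?\z/
        && $sales > 0;
    return $commission / $sales;
}

for my $choice ( '3', 'q' ) {
    my $valid = eval { validate_menu_choice( $choice, 5 ) };
    print defined $valid ? "accepted $valid\n" : "rejected: $@";
}
```

Each failure produces a message that says which input was wrong and why, which is far easier to act on than a stack of downstream symptoms.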

Factor out common elements

Do you ever find yourself rewriting the same snippet of code? Have you ever had to do a global search and replace on a program? The odds are, you have duplicated something that you should have factored out. Here's a beautiful JavaScript example our design department turned out:

function changeLoc(formNum) {
    if ( document.forms[0].elements[formNum].options[document.forms[0].elements[formNum].selectedIndex].value == "Corporate Home" ) {
        parent.location.href = "";
    }
    else if ( document.forms[0].elements[formNum].options[document.forms[0].elements[formNum].selectedIndex].value != "nogo" || page != "") {
        top.i3.location.href = document.forms[0].elements[formNum].options[document.forms[0].elements[formNum].selectedIndex].value;
    }
    document.forms[0].elements[formNum].selectedIndex = 0;
}

Ooh, that's miserable. After factoring out the appropriate form value:

function changeLoc(formNum) {
    var page = document.forms[0].elements[formNum].options[document.forms[0].elements[formNum].selectedIndex].value;
    if ( page == "Corporate Home" ) {
        parent.location.href = "";
    }
    else if ( page != "nogo" && page != "" ) {
        // Navigate only when the value is neither "nogo" nor empty.
        top.i3.location.href = page;
    }
    document.forms[0].elements[formNum].selectedIndex = 0;
}

Much better. Now, if we need to tweak the page value at all, we only do it in one place.

For Perl, here's an example from a module I wrote recently (simplified for clarity):

sub update_foo {
    my ( $self, $data ) = @_;
    my $id = $data->{ textID };
    delete $data->{ textID };
    if ( $id !~ /^\d+$/ ) {
        croak "textID '$id' in update_foo must be numeric.";
    }
    my ( $field_values, $values ) = $self->_format_update_data( $data );
    my $sql = "UPDATE giText SET $field_values WHERE textID = ?";
    push @$values, $id;
    my $return = $self->_update_database( $sql, $values );
    $self->{ _dbh }->commit if ! $self->{ _error };
    return $return;
}

After rewriting this routine for the third time, I realized that the only thing changing was my ID and the table name. Needless to say, that quickly changed. Now, my "update" methods only validate the ID and supply the correct table name. They are then passed to a generic update method. If I ever need to update that, I only have one place to do it instead of three.
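The shape of that refactoring might look something like the sketch below. It is not the actual module: build_update is a hypothetical stand-in for the generic update method, it only builds the SQL and bind values so it can run without a database, and the giImage table with its imageID column is invented to show a second caller.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical generic helper: validate the ID, then build the UPDATE
# statement and its bind values for any table. Table and column names are
# interpolated into the SQL, so they must come from trusted code, never
# from user input.
sub build_update {
    my ( $table, $id_column, $data ) = @_;
    my %fields = %$data;                  # copy, so the caller's hash is untouched
    my $id     = delete $fields{$id_column};
    croak "$id_column must be numeric"
        unless defined $id && $id =~ /\A\d+\z/;
    my @columns = sort keys %fields;
    croak "No columns to update in $table" unless @columns;
    my $set = join ', ', map { "$_ = ?" } @columns;
    my $sql = "UPDATE $table SET $set WHERE $id_column = ?";
    return ( $sql, [ @fields{@columns}, $id ] );
}

# Thin wrappers: each supplies only what differs between tables.
sub update_foo { my ($data) = @_; return build_update( 'giText',  'textID',  $data ) }
sub update_bar { my ($data) = @_; return build_update( 'giImage', 'imageID', $data ) }

my ( $sql, $bind ) = update_foo( { textID => 7, title => 'Hello' } );
print "$sql\n";
print "bind values: @$bind\n";
```

Any change to how updates are assembled or validated now happens once, in build_update, and every wrapper picks it up.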


The examples that I gave above were mostly focused on Perl. I did that because this is a Perl-related site and some of the monks who read this may only know Perl. However, the principles are not restricted to Perl. Hence the title 'use strict' is not Perl.

One of the things that really surprised me after I started learning about how to write code well is that I could often judge code quality of languages that I had never used. When I first started learning PHP, I could easily spot rotten programs. I don't know JavaScript well, but I'm constantly cleaning up our design department's JavaScript, despite the fact they know it much better than I. Good coding is not language specific.

Whether you are a brand-new programmer or a seasoned veteran, these principles will apply to virtually any programming language you use. Sure, you can't predeclare variables in PHP and COBOL only uses global variables, but that doesn't invalidate the other principles. If you get in the habit of spending a little time up front learning these things, you will be well-rewarded by writing better, tighter code that is much easier to maintain and has fewer bugs.


Vote for paco!

Join the Perlmonks Setiathome Group or just click on the link and check out our stats.

In reply to 'use strict' is not Perl by Ovid
