
Last week, hakkr posted some coding guidelines which I found to be too restrictive, and which did not address enough aspects. Therefore, I've made some guidelines as well. These are my personal guidelines; I'm not enforcing them on anyone else.

Warnings SHOULD be turned on.
Turning on warnings helps you find problems in your code. But it's only useful if you understand the messages generated. You should also know when to disable warnings - they are warnings after all, pointing out potential problems, but not always bugs.
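A minimal sketch of disabling one warning category in the smallest possible scope. The sub name `join_fields` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: join fields, treating undef as an empty string.
sub join_fields {
    my @fields = @_;
    # An undef field is expected here, so silence only that one
    # category, and only inside this sub.
    no warnings 'uninitialized';
    return join ",", @fields;
}
```

Outside the sub, all warnings stay on.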
Larger programs SHOULD use strictness.
The three forms of strictness can help you to prevent making certain mistakes by restricting what you can do. But you should know when it is appropriate to turn off a particular strictness, and regain your freedom.
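As a sketch of turning off one form of strictness (`refs`) only where it is needed: the `log_*` sub names below are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical example: generate one formatting sub per log level.
for my $level (qw (debug info error)) {
    # Installing a sub by name needs a symbolic reference, so turn
    # off 'refs' only, and only inside this block.
    no strict 'refs';
    *{"log_$level"} = sub { return "[$level] @_" };
}
```

The `vars` and `subs` forms of strictness remain in force inside the loop.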
The return values of system calls SHOULD be checked.
NFS servers will be down, permissions will change, files will disappear, disks will fill up, resources will be used up. System calls can fail for a number of reasons, and failure is not uncommon. Programs should never assume a system call will succeed - they should check for success and deal with failures. The rare case where you don't care whether the call succeeded should have a comment saying so.
All system calls should be checked, including, but not limited to, close, seek, flock, fork and exec.
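A minimal sketch of checking every call, including `close`. The sub name `write_lines` is an assumption for the example; a real program may want to do something smarter than `die`.

```perl
use strict;
use warnings;

# Hypothetical helper: write lines to a file, checking every call.
sub write_lines {
    my ($file, @lines) = @_;
    open (my $fh, ">", $file) or die "open $file: $!";
    for my $line (@lines) {
        print {$fh} "$line\n" or die "print $file: $!";
    }
    # Buffered output may only fail here, for instance on a full disk.
    close ($fh)               or die "close $file: $!";
    return scalar @lines;
}
```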
Programs running on behalf of someone else MUST use tainting; untainting SHOULD be done by checking for allowed formats.
Daemons listening to sockets (including, but not limited to CGI programs) and suid and sgid programs are potential security holes. Tainting can help secure your programs by tainting data coming from untrusted sources. But it's only useful if you untaint carefully: check for accepted formats.
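A sketch of untainting by accepting a known format rather than rejecting known bad characters. The sub name and the exact login format are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: accept only short, lowercase login names.
# Under -T, a regex capture is what yields an untainted value.
sub untaint_login {
    my ($input) = @_;
    return unless defined $input;
    my ($login) = $input =~ /\A([a-z][a-z0-9_]{0,31})\z/
        or return;
    return $login;
}
```

Anything that does not match the allowed format is simply refused.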
Programs MUST deal with signals appropriately.
Signals can be sent to the program. There are default actions - but they are not always appropriate. If not, signal handlers need to be installed. Care should be taken since not everything is reentrant. Perls before 5.8.0 and perls from 5.8.0 onwards each have their own issues.
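One way to sidestep most reentrancy problems is to keep the handler tiny and act on the signal later. A minimal sketch, with a made-up `run` loop standing in for the real work:

```perl
use strict;
use warnings;

# Keep the handler tiny: record the signal, act on it later.
my $shutdown = 0;
$SIG {TERM} = sub { $shutdown = 1 };
$SIG {HUP}  = 'IGNORE';

# Hypothetical main loop: process work items until asked to stop.
sub run {
    my @work = @_;
    my $done = 0;
    while (@work && !$shutdown) {
        shift @work;    # Stand-in for doing one unit of work.
        $done ++;
    }
    return $done;
}
```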
Programs MUST deal with early termination appropriately.
END blocks and __DIE__ handlers should be used if the program needs to clean up after itself, even if the program terminates unexpectedly - for instance due to a signal, an explicit die or a fatal error.
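A sketch of cleaning up a scratch file in an END block. The scratch file is hypothetical, and mapping INT and TERM to exit value 1 is just one possible choice.

```perl
use strict;
use warnings;
use File::Temp qw (tempfile);

# Hypothetical scratch file that must not be left behind.
my ($fh, $scratch) = tempfile ();

# END blocks do not run when a signal kills the program outright,
# so turn the common signals into a normal exit first.
$SIG {$_} = sub { exit 1 } for qw (INT TERM);

END {
    my $status = $?;    # Cleanup must not change the exit value.
    unlink $scratch if defined $scratch && -e $scratch;
    $? = $status;
}
```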
Programs MUST have an exit value of 0 when running successfully, and a non-0 exit value when there's a failure.
Why break a good UNIX tradition? Different failures should have different exit values.
Daemons SHOULD never write to STDOUT or STDERR but SHOULD use the syslog service to log messages. They should use an appropriate facility and appropriate priorities when logging messages.
Daemons run with no controlling terminal, and usually their standard output and standard error disappear. The syslog service is a standard UNIX utility especially geared towards daemons with a logging need. It allows the system administrator to determine what is logged, and where, without the need to modify the (running) program.
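A minimal sketch using the core Sys::Syslog module. The daemon name `exampled` and the wrapper `log_event` are made up; which facility and priorities are appropriate depends on the daemon.

```perl
use strict;
use warnings;
use Sys::Syslog qw (:standard :macros);

# Hypothetical wrapper; "exampled" is a made-up daemon name.
sub log_event {
    my ($priority, $format, @args) = @_;
    # "nofatal": warn instead of die if syslog cannot be reached.
    openlog  ("exampled", "pid,nofatal", LOG_DAEMON);
    syslog   ($priority, $format, @args);
    closelog ();
}

# Usage: log_event (LOG_WARNING, "disk %s is %d%% full", "/var", 93);
```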
Programs SHOULD use Getopt::Long to parse options. Programs MUST follow the POSIX standard for option parsing.
Getopt::Long supports historical style arguments (single dash, single letter, with bundling), POSIX style, and GNU extensions. Programs should accept reasonable synonyms for option names.
Interactive programs MUST print a usage message when called with wrong, incorrect or incomplete options or arguments.
Users should know how to call the program.
Programs SHOULD support the --help and --version options.
--help should print a usage message and exit, while --version should print the version number of the program.
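The option rules above can be sketched with Getopt::Long as follows. The program name, the `--verbose` option, the version number and the sub `parse_args` are all assumptions for the example. The sub only classifies the command line; printing and exiting is left to the caller.

```perl
use strict;
use warnings;
use Getopt::Long qw (GetOptionsFromArray);

our $VERSION = "0.01";                     # Hypothetical version.
my  $USAGE   = "usage: example [--help] [--version] [--verbose] FILE\n";

# Returns an action name, followed by the options when there is
# something to run.  Typical dispatch by the caller:
#   usage_error: print STDERR $USAGE; exit 2
#   help:        print $USAGE;        exit 0
#   version:     print "$VERSION\n";  exit 0
sub parse_args {
    my @args = @_;
    my %opt  = (verbose => 0);
    Getopt::Long::Configure ("bundling", "no_ignore_case");
    GetOptionsFromArray (\@args, \%opt,
                         "help|h|?", "version|V", "verbose|v+")
        or return ("usage_error");
    return ("help")        if $opt {help};
    return ("version")     if $opt {version};
    return ("usage_error") unless @args == 1;
    return ("run", %opt, file => $args [0]);
}
```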
Code SHOULD have an exhaustive regression test suite.
Regression tests help catch breakage of code. The regression tests should 'touch' all the code - that is, every piece of code should be executed when running the regression suite. All border cases should be checked. More tests are usually better than fewer tests. Behaviour on invalid inputs needs to be tested as well.
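A small sketch with Test::More, covering a normal case, both borders, both sides outside the range, and invalid input. The sub `clamp` is a made-up example, not something from a real module.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical sub under test: clamp a number into [$low, $high].
sub clamp {
    my ($value, $low, $high) = @_;
    die "clamp: low exceeds high\n" if $low > $high;
    return $value < $low  ? $low
         : $value > $high ? $high
         :                  $value;
}

is (clamp (5,  0, 10), 5,  "inside the range");
is (clamp (0,  0, 10), 0,  "on the lower border");
is (clamp (10, 0, 10), 10, "on the upper border");
is (clamp (-1, 0, 10), 0,  "below the range");
is (clamp (11, 0, 10), 10, "above the range");
ok (!eval { clamp (1, 10, 0); 1 }, "invalid range is rejected");

done_testing ();
```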
Code SHOULD be in source control.
A source control tool will take care of keeping track of the history or change log, version numbers and who made the most recent change(s).
All database modifying statements MUST be wrapped inside a transaction.
Your data is likely to be more important than the runtime or code size of your program. Data integrity should be retained at all costs.
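A sketch of wrapping two related updates in one transaction. It assumes `$dbh` is a DBI database handle connected with `RaiseError => 1` and `AutoCommit => 1`; the `accounts` table, its columns and the sub name `transfer` are made up.

```perl
use strict;
use warnings;

# Sketch assuming $dbh is a DBI handle connected with RaiseError => 1
# and AutoCommit => 1; the table and column names are made up.
sub transfer {
    my ($dbh, $from, $to, $amount) = @_;
    $dbh -> begin_work;
    my $ok = eval {
        $dbh -> do ("UPDATE accounts SET balance = balance - ? " .
                    "WHERE id = ?", undef, $amount, $from);
        $dbh -> do ("UPDATE accounts SET balance = balance + ? " .
                    "WHERE id = ?", undef, $amount, $to);
        $dbh -> commit;
        1;
    };
    return 1 if $ok;
    my $error = $@ || "unknown error";
    eval { $dbh -> rollback };      # Both updates are undone together.
    die "transfer failed: $error";
}
```

Either both updates are committed, or neither is.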
Subroutines in standalone modules SHOULD perform argument checking and MUST NOT assume valid arguments are passed.
Perl doesn't check the types, or even the number, of arguments at compile time. You will have to do that yourself.
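A minimal sketch of checking both the number and the kind of arguments at run time, using the core Carp and Scalar::Util modules. The sub `mean` is a made-up example.

```perl
use strict;
use warnings;
use Carp         qw (croak);
use Scalar::Util qw (looks_like_number);

# Hypothetical sub: mean of a non-empty list of numbers.
sub mean {
    croak "mean: need at least one argument" unless @_;
    for my $arg (@_) {
        croak "mean: undefined argument" unless defined $arg;
        croak "mean: '$arg' is not a number"
            unless looks_like_number ($arg);
    }
    my $sum = 0;
    $sum += $_ for @_;
    return $sum / @_;
}
```

Using `croak` reports the error from the perspective of the caller, which is where the mistake was made.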
Objects SHOULD NOT use data inheritance unless it is appropriate.
This means that "normal" objects, where the attributes are stored inside anonymous hashes or arrays should not be used. Non-OO programs benefit from namespaces and strictness, why shouldn't objects? Use objects based on keying scalars, like fly-weight objects, or inside-out objects. You wouldn't use public attributes in Java all over the place either, would you?
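A minimal sketch of an inside-out object, one of several ways to do this. The class `Counter` and its attributes are made up for illustration.

```perl
package Counter;    # Hypothetical class, for illustration only.

use strict;
use warnings;
use Scalar::Util qw (refaddr);

# Attributes live in lexical hashes keyed by the object's address,
# so a misspelled attribute name is a compile-time error.
my %count;
my %step;

sub new {
    my ($class, %args) = @_;
    my $self = bless \do {my $anon}, $class;
    $count {refaddr $self} = 0;
    $step  {refaddr $self} = defined $args {step} ? $args {step} : 1;
    return $self;
}

sub increment {
    my ($self) = @_;
    return $count {refaddr $self} += $step {refaddr $self};
}

# Without this, the attribute hashes would leak an entry per object.
sub DESTROY {
    my ($self) = @_;
    delete $count {refaddr $self};
    delete $step  {refaddr $self};
}

package main;
```

Code outside the file cannot reach `%count` or `%step` at all, and subclasses cannot clobber them by accident.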
Comments SHOULD be brief and to the point.
If you need lots of comments to explain your code, you may consider rewriting it. Subroutines that have a whole blob of comments describing arguments and return values are suspect. But do document invariants, pre- and postconditions, (mathematical) relationships, theorems, observations and other relevant things the code assumes. Variables with a broad scope might warrant comments too.
POD SHOULD NOT be interleaved with the code, and is not an alternative for comments.
Comments and POD have two different purposes. Comments are there for the programmer: the person who has to maintain the code. POD is there to create user documentation from, for the person using the code. POD should not be interleaved with the code because this makes it harder to find the code.
Comments, POD and variable names MUST use English.
English is the current Lingua Franca.
Variables SHOULD have as limited a scope as is appropriate.
"No global variables", but better. Just disallowing global variables means you can still have a loop variable with a file-wide scope. Limiting the scope of variables means that loop variables are only known in the body of the loop, temporary variables only in the current block, etc. But sometimes it's useful for a variable to be global, or have a file-wide scope.
Variables with a small scope SHOULD have short names, variables with a broad scope SHOULD have descriptive names.
$array_index_counter is silly; for (my $i = 0; $i < @array; $i ++) { .. } is perfect. But a variable that's used all over the place needs a descriptive name.
Constants (or variables intended to be constant) SHOULD have names in all capitals (with underscores separating words), and so SHOULD IO handles. Package and class names SHOULD use title case, while other variables (including subroutines) SHOULD use lower case, words separated by underscores.
This seems to be quite common in the Perl world.
Custom delimiters SHOULD be tall and skinny.
/, !, | and the four sets of brackets are acceptable; #, @ and * are not. Thick delimiters take too much attention. An exception is made for: q $Revision: 1.1.1.1$, because RCS and CVS scan for the dollars.
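A few acceptable delimiters in use. The path is just a made-up example value.

```perl
use strict;
use warnings;

my $path = "/usr/local/bin/perl";          # Hypothetical input.

# Braces, bangs and bars keep the slashes in the pattern unescaped.
(my $dir = $path) =~ s{/[^/]+\z}{};
my $absolute      =  $path =~ m!\A/!;
my @parts         =  split m|/|, $dir;
```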
Operators SHOULD be separated from their operands by whitespace, with a few exceptions.
Whitespace increases readability. The exceptions are:
- Unary +, -, \, ~ and !.
- No whitespace between a comma and its left operand.
Note that there is whitespace between ++ and -- and their operands, and between -> and its operands.
There SHOULD be whitespace between an identifier and its indices. There SHOULD be whitespace between successive indices.
Taking an index is an operation as well, so there should be whitespace. Obviously, we cannot apply this rule in interpolative contexts.
There SHOULD be whitespace between a subroutine name and its parameters, even if the parameters are surrounded by parens.
Again, readability.
There SHOULD NOT be whitespace after an opening parenthesis, or before a closing parenthesis. There SHOULD NOT be whitespace after an opening indexing bracket or brace, or before a closing indexing bracket or brace.
That is: $array [$key], $hash {$key} and sub ($arg).
The opening brace of a block SHOULD be on the same line as the keyword and the closing brace SHOULD align with the keyword, but short blocks are allowed to be on one line.
This is K&R style bracing, except that we require it for subroutines as well. We do allow map {$_ * $_} @args to be on one line though.
No cuddled elses or elsifs. But the while of a do { } while construct should be on the same line as the closing brace.
It just looks better that way! ;-)
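Put together, the whitespace and bracing rules above give code that looks like the following sketch. The sub `classify_words` is made up purely to show the layout.

```perl
use strict;
use warnings;

# Hypothetical sub: count distinct words that are at least
# $min_length characters long, and those that are shorter.
sub classify_words {
    my ($min_length, @words) = @_;
    my %seen;
    my ($long, $short) = (0, 0);
    for (my $i = 0; $i < @words; $i ++) {
        my $word = lc $words [$i];
        next if $seen {$word} ++;
        if (length ($word) >= $min_length) {
            $long ++;
        }
        else {
            $short ++;
        }
    }
    return ($long, $short);
}
```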
Indents SHOULD be 4 spaces wide. Indents MUST NOT contain tabs.
4 spaces seems to be an often used compromise between the need to make indents stand out, and not getting cornered. Tabs are evil.
Lines MUST NOT exceed 80 characters.
There is just no excuse for that. More than 80 characters means it will wrap in too many situations, leading to hard to read code.
Align code vertically.
This makes code look more pleasing, and it draws attention to the fact that similar things are happening on nearby lines. Example:
```
    my $var      =  18;
    my $long_var = "Some text";
```

This is just a first draft. I've probably forgotten some rules.

Abigail
