Defensive Programming and Audit Trails

Why write a logfile?

When processing data programatically, writing a log of what's being done, and to which data, is extremely helpful. It gives you a blow-by-blow description of what your code's doing, verifying (or falsifying) your assumptions at every step of the process. If it's detailed enough, you can use it to recover from small disasters ("Crap! I didn't mean to delete that record!"), and it's essential in tracking down errors in large, multi-step data grinders. The small cost of keeping a text file around (which, once your program's run is complete, can easily be gzipped...) is negligible compared to the extra debugging muscle.

davorg covered some of this ground quite well in his excellent Data Munging with Perl book; I'm going to expand on the material there, probably repeat some of it, and digress a little bit into good and bad methods of "Defensive Programming".

What to log?

The quick, and usually wrong, answer is "everything!" It's great to know exactly what your code's doing during development, but in production, logging every byte of every packet you process isn't just silly, it defeats the purpose. Remember, most of your logs are going to be mind-numbingly normal, and most of the time, if you're looking at the logs you're looking for problems. Having to sort through megabytes of chaff makes your job much harder.

davorg suggests tuneable levels of logging, and I strongly concur. I'd suggest making the logging level a command-line parameter (the -v switch seems to be popular for this purpose), rather than davorg's suggested environment variable, if only because it's more visible to the user.

In general, it's probably best to log more data than less. Definitely log key data: if you're munging sendmail logs looking for mail to abuse, root, or postmaster, then log the To: address of each mail sent, and whether you processed or discarded the record.

If you're going to be counting on the logs as backups (this may make more sense than doing "real" backups if the "real" backup would entail a database dump, for instance, using much more space than a flat-text log), you will of course have to log all the data necessary to rebuild a useful system. This might mean logging everything, or you may be able to skip "nice-to-have" data that aren't strictly required for functional operation.

Log all error conditions! Log any unexpected input! Make these log entries immediately obvious! (I tend to prefix any such entries with ***ERROR or ***WARNING, depending on severity.) Check all important assumptions made about the data, and as many unimportant ones as is practical.

What not to log?

Any sensitive data. Credit card numbers, passwords, that sort of thing. This may include data that are subject to a privacy policy, like users' email addresses.

How to log

The easiest way is to just write to STDERR. There's no guarantee that anything will get saved: you'll have to redirect the output from the shell. This is sometimes useful, and often catastrophic (when you realize five minutes into a job that can't be backed out that all the output's going to the console, for instance). On the other hand, any logging's better than no logging at all, and writing to STDERR is often good enough, especially if you're just writing a quick hack.

The second-easiest way is to pull something from CPAN and use it instead of rolling your own logging code. A quick search found:

Apache::LogFile
CGI::Log
Log::Agent
Log::Channel
Log::Common
...a billion other Log::Foobar modules...
Net::Daemon::Log
Net::Syslog (useful for remote log servers)
Tie::Log
Unix::Syslog
Win32::EventLog

Whew! If you choose the third way (write your own logging routines), don't bother posting it to CPAN.

Log formats

Keep in mind that your logfiles should be easy for both humans and computers to read. I usually find that I identify a problem by scanning the logs by hand, then write a script to parse them, identifying each instance of a problem and fixing (or at least hiding) it programatically. Try to separate input records with an empty line, for instance. Good coding style usually makes good logging style, too; read perlstyle.

While we're on the subject, the more standard your logs can be, the better, since tools written to work on one set can be applied to another. (Suggestions?)

I don't like XML logs; I find them too difficult to read in a standard text editor (which, if I'm logging in remotely, is probably all I have).

What other resources are available?

LogReport is a general online resource for logging software, knowledge, and practices.

A note about defensive programming

Defensive programming doesn't mean "recover seamlessly from all input errors without a peep", it means "recognize the errors and do what's appropriate". If you don't know that your input's screwed, and your "defensive" program is substituting in reasonable defaults, you could have a big problem (bogus results, broken software upstream making it into production, etc) that doesn't make itself known until it's too late to fix (or at least to fix easily). On the other hand, if your input's broken in a consistent and harmless manner, you don't want kilobytes of benign errors drowning out other, more significant problems in your logs. What to do?

In general, I'd lean towards more logging, rather than less, and more fine-grained control over logging verbosity. If you know that you're going to be encountering the same sort of brain-damaged input over and over (Word-generated HTML, for instance), a simple "got bogus input foo, ignoring" the first time you see it is probably okay.

Update: added readmore tag. Thanks trs80!

Update 2002 Aug. 9: added Resources section

-- F o x t r o t U n i f o r m Found a typo in this node? /msg me

Comment on Defensive Programming and Audit Trails Select or Download Code

Replies are listed 'Best First'.

Re: Defensive Programming and Audit Trails
by Abigail-II (Bishop) on Aug 06, 2002 at 09:26 UTC

Abigail

[reply]

Re: Re: Defensive Programming and Audit Trails

by agentv (Friar) on Aug 06, 2002 at 19:01 UTC

It is not necessary to dump all of your messages into a single log file. One very elegant strategy I encountered recently was to maintain a separate file for messages of each severity level.

Imagine a set of message severity levels:

chatter
progress
information
warning
error

If we use "information" as the default logging level, we would produce three output log files. If the end-user selects a higher level of logging, then files for that severity level would be updated too.

In this software, the designers also had log information sent to STDERR so that you could observe if you felt like it. The syntax model for the command line was like:

program [-nostderr] [-loglevel <severity>]
[download]

As an administrator of that software, I appreciate the ability to ask for as much (or as little) information as I need. I try to write my own programs in that same fashion.

...all the world looks like -well- all the world, when your hammer is perl. ---v

[reply]
[d/l]

Re: Defensive Programming and Audit Trails
by tmiklas (Hermit) on Aug 06, 2002 at 10:48 UTC

you should log all exceptions and failures

enough for you to know exactly what failed, but not enough for others do compromise data your software is work with

the less you know, the better you sleep

[reply]

Re: Defensive Programming and Audit Trails
by tbone1 (Monsignor) on Aug 06, 2002 at 12:47 UTC

First, verbose levels work. They allow you to run production software in a very chatty mode at the drop of a hat. More, you are 99.9% certain that your logging is not hiding the error. (But beware that .1%; and yes, that is the voice of experience that you're hearing.)

Also, don't be afraid to print and use return codes. At one time I considered myself too clever for such things, I could work things out programmatically, etc. "How terribly procedural! I, being a super-genius, shall write methods/functions/whatever to handle that!" In other words, I was young and therefore stupid; I had to get knocked on my gluteus maximus before I realized the value of this. Also, try printing any key values in the message. This can confirm quickly that everything was working up to the point of the error message, or that something went wrong beforehand.

Also, if you are rolling your own error messages, write those messages in such a way that they are easily differentiated from the surrounding, normal logging, yet can be differentiated from one another easily. This allows you to do quick checking with, say, a Perl script.

Oh, and if error logs in XML are not the work of Satan, then it's only because he has subcontracted them.

--
tbone1
As God is my witness, I thought turkeys could fly.

[reply]

Re: Defensive Programming and Audit Trails
by hakkr (Chaplain) on Aug 06, 2002 at 08:53 UTC

The web server handles all logging in terms of access and error logging for most CGI scripts. The detail of what is actually logged often depends on the quality and existance of error capture code. My approach so as not to have to sift through the webserver error logs is to redirect Stderr to a different file for each script using CGI::Carp qw/set_message/

Do you think in these instances logging is therefore unecessary?

[reply]
[d/l]

Re(2): Defensive Programming and Audit Trails

by FoxtrotUniform (Prior) on Aug 06, 2002 at 15:35 UTC

The web server handles all logging in terms of access and error logging for most CGI scripts.

...

If your using DBI the database logs all data with greater efficiency than a hand rolled solution.

...

Do you think in these instances logging is therefore unecessary?

Depends on whether what gets logged is what needs to be, I guess. :-) In general, I'd say no. These logs describe what happens at an interface, and misuse of an interface isn't necessarily the same as assumptions being violated. When it is, its signature can be quite unhelpful.

For example, maybe I'm updating a database through DBI, based on a key built from several bits of data. I pull these different bits out from the input record with regular expression captures, cat them all together (somehow) to build the key, and then do a $sth->execute or similar. Seems reasonable, right?

If one of my regex matches fails -- maybe the input data are corrupt, or one of the upstream programmers changed the record format slightly, or something -- the best I'm likely to do is get a "use of uninitialized value, foo.pl line 666" warning from perl. I'll build a nonsense key, then try updating on it -- and updating on a key that doesn't exist isn't an error, last I checked, just a null op. Now maybe that warning makes sense to me, but if I'm catting three different strings together, any (or all) of which might be the problem, it's going to take a lot of unnecessary effort to track down the problem.

If, on the other hand, I do something like:

  my ($key_part) = ($record =~ /$foo/)
    or warn "***WARNING: input record <$record> doesn't match $foo\n";
[download]

then I know exactly what's wrong, and how it's wrong, just from looking at the logs.

To me, that's a big win.

-- F o x t r o t U n i f o r m Found a typo in this node? /msg me

[reply]
[d/l]
[select]

Re: Defensive Programming and Audit Trails
by cybear (Monk) on Aug 06, 2002 at 13:01 UTC

A good program should log every pertinant piece of information,
working directories, parameters passed on the command line, start times
end times, files read from/written to, etc. The log should be very
friendly to the eyes, and work well with tools such as grep.
A good log will have, lots of space between log sections, obvious signs of positive or
negative results, meaningful error messages, clearly understandable
tags, etc.

For example, this is the log of an FTP program that I wrote:
(PS -it was decided that end times would not be all that helpful in this case)

Date: Tue Aug  6 04:00:00
Ping test results: Working
Using Firewall: xxx.xxxxxxx.com
GEIS ID: 12:??????????
+1 File: /opt/somedir/data/xxx/xxx.2002_08_06.04.00.00


Date: Tue Aug  6 04:25:01
Ping test results: Working
Using Firewall: xxx.xxxxx.com
GEIS ID: ZZ:??????????
-No file to download.


Date: Tue Aug  6 04:25:01
Ping test results: Working
Using Firewall: xxx.xxxxx.com
GEIS ID: ZZ:??????????
-[!!!]Download failed[!!!] check /opt/somedir/detailedLog for details.
[download]

Scanning through this file for errors is easy, visually the
individual entries are short, time stamped and each major
piece of the processing has an individual line in the report.

Additionally successful downloads are marked with a "+",
No data to download is marked with a "-", and errors are
marked with "!!!". These prepended characters, "+" "-" "!!!",
All work with grep pretty well. After a long weekend, with three
of four days of logs to go through, I'll just grep !!! logfile
to quickly get caught up with my log reading.

[reply]
[d/l]
[select]

Re: Defensive Programming and Audit Trails
by dws (Chancellor) on Aug 06, 2002 at 17:48 UTC

dump the environment into the log!

To defend against this, consider adding a logging option that will dump the following in to the log

Checksums (or MD5 hashs) of your components
The versions of all packages you depend on
The versions (or sizes and timestamps) of all .DLLs or shared objects you depend on
The OS version and patch level
The database version (which you can usually query from the database after connecting)
Available disk space
Anything else that's appropriate for your application

Having this info in-hand can save hours of work, particularly when your support folks are dealing with a customer who is yelling loudly that your stuff is broken, and that nothing, nothing! on their end has changed.

And getting set up to record this info forces you to know your application at a deeper level, which can also be a win.

[reply]

Re: Defensive Programming and Audit Trails
by mjeaton (Hermit) on Aug 06, 2002 at 13:33 UTC

Trace was for things that a developer wanted to see like SQL statements, variable info, etc.

Since our support people weren't developers, they would see all the status messages -- startup, shutdown, records inserted, updated and deleted.

If an alert log was written, an email would be generated and a CriticalAlert log would generate a page to the on-call support person.

The dll did a couple of things...it would write a local log file and then broadcast the same information to a server where it was stored in a database. A service-type application would poll the database every n minutes and, depending on the severity of the error, would do nothing, email a list of people or send a page to the person on call.

It was very cool and I haven't seen anything like it since. I make sure I log things like startup, person logged in (username), things involving the data (inserts, updates, deletes) and then finally shutdown. Each entry is timestamped (do ya know how many times I've seen logs with no timestamp -- only the date?) and of course my log files are comma-delimited. It's amazing now many times I've seen people log in a format that is NOT readable with a text editor.

mike

[reply]

Re: Defensive Programming and Audit Trails
by trs80 (Priest) on Aug 06, 2002 at 15:37 UTC


# OO method for error to STDERR uisng 'warn'
# which under mod_perl/Apache is your
# default error_log

sub error_to_log {
    my ($self,$error,$override) = @_;
        if ($self->_debug() || $override) {
        warn "$error\n";
    }
}
[download]

$object->error_to_log("I made it this far, I can't believe it");
[download]

$object->error_to_log("Date: $date Variable: $var",1);
[download]


# $debug is a global variable
# you will need to assign

sub error_to_log {
    my ($error,$override) = @_;
        if ($debug || $override) {
        warn "$error\n";
    }
}
[download]

&error_to_log("Error message");
[download]

[reply]
[d/l]
[select]

Re: Defensive Programming and Audit Trails
by frag (Hermit) on Aug 06, 2002 at 18:29 UTC

ISO 8601

HTTP::Date

plenty

So, when you pick a log file format, give the timestamps you use some thought, and be sure they're consistent.

-- Frag.
"Never could stand that dog."

[reply]

Re: Defensive Programming and Audit Trails
by mattr (Curate) on Aug 07, 2002 at 08:05 UTC

Another use I have found recently is using a log file to figure out what is going on in a child, during development. I had some legacy code and this child was sending email, it was being ported to windows.. well it was very useful.

Sometimes I use shell output redirection so I get stderr and stdout both going to the same file, to log what happened with a system call. I don't want to post the wrong info but it is something like &2>1, see man bash.

[reply]

Re: Defensive Programming and Audit Trails
by Jenda (Abbot) on Aug 09, 2002 at 14:50 UTC

Seems we all agree. Strange ... ;-)

I would suggest one more thing ... before you archive or send to yourself the logs ... process them with a script that'll strip out the repetitive "Everything's OK" messages and similar things. Eg. pretty much all my services/daemons "tick", they write once a minute or whatever the interval is set to a message that just means "I'm still alive and ticking, there's nothing to do for me." I want this info to be sure the service did not get stuck. Once a day I move all log files to archive, replace the groups of "ticks" with simple "Ticking $Since - $To", zip everything up and both archive on the server and mail to me. I also include some info about the log files and the environment to the body of the emails.

The script that "polishes" the log files can (should) parse the logs and add to the message body the number of processed tasks and (more important) the number of suspicious or errorneous messages.

This way you may be the first who knows about a problem and you may fix it before anyone else notices ;-)

Jenda

[reply]

Re(2): Defensive Programming and Audit Trails

by FoxtrotUniform (Prior) on Aug 09, 2002 at 15:29 UTC

Seems we all agree. Strange ... ;-)

I didn't exactly expect that to be a controversial node. The problem is, I keep running into people who just don't think about logging program status, or error conditions, or emitting any sort of status info at all... those are the people I wrote the node for.

On the subject of notification: cron(8) will mail any output to you (well, to $MAILTO) after each cron job. I imagine other similar facilities exist elsewhere. Provided you don't get mail from your crons on a regular basis ("Hey! I completed successfully! Isn't that great?"), mail from cron(8) is an effective error-reporting facility; you don't have to wait until an error-checking/log polishing script runs at some time in the future, you know right away when the job finishes. (Or at least the next time you check your mail.)

-- F o x t r o t U n i f o r m Found a typo in this node? /msg me The hell with paco, vote for Erudil!

[reply]

Re: Re(2): Defensive Programming and Audit Trails

by Jenda (Abbot) on Oct 31, 2002 at 14:28 UTC

The success mail is fine if there are two or three tasks a day. Not if there are tens and hundreds. I don't want to spam myself ;-)

I guess this is the worst problem of logging and error reporting ... you either have too few or too much. But never just as much as you want ;-)

Jenda

[reply]