markjugg has asked for the wisdom of the Perl Monks concerning the following question:
Now that I'm becoming a wiser and more experienced perl hacker, I'm learning that good error handling systems are a bit of an art form. The book "Practice of Programming" recommends to do something like "detect errors at a low level, deal with them on a high level". This is a philosophy I can appreciate.
To illustrate my discussion, here's a error handling subroutine that could be used by someone checking the return values from a Postgres insert from a CGI form:
sub is_pg_date_error {
my $key = shift;
my $value = param($key);
if ($DBI::errstr =~ m/Bad date external representation '$value'/o)
+ {
error(title=>'Inproper date format',
msg=>"There is a problem with $key field. Please return an
+d format it
in the format YYYY-MM-DD and confirm that it is a valid da
+te.")
}
}
This is probably bad style, because it detects the error it a low level (where this routine is hidden away), but it doesn't allow the error to handled at a high level-- it throws an error itself immediately. The issue here is that most of the time I do want to throw this error. So I want to be able to handle the error at a high level, but for it happen automatically if I want to.
Looking at how CGI.pm and DBI.pm deal with errors, they both have similiar approaches which are inline with the "Practice of Programming" principles-- when there is an error, they populate some global routines in their package with the relevent errors. You can choose to do something with these errors or just ignore them.
So, when developing custom modules, would recommend the same error handling scheme in general? That is, when there is an error, instead of dealing with directly or returning it to user, stick it into a package global which the user can manipulate on their own? If not that, what other insights on the art of error handling do you have? Thanks!
-mark
Re: The art of error handling
by mikfire (Deacon) on Dec 19, 2000 at 22:06 UTC
|
The short fall of mirod's scheme is that is usually only
one right answer, but there infinite wrong answers. For example,
if I write some code that opens a file for reading, many things
can go wrong. The file might not exist, the file might not
be readable by the current user, the file might be empty, etc.
If you are trying to give meaningful errors to the calling
routines, you have exhausted the 0 or undef on failure option.
All the calling code can do is say "Yup, something bad
happened".
This gets even more complex when the low level routines are
interacting with system(). To continue the example above,
lets say the subroutine opens a file, reads a command from it
and executes it via system() ( yes, security sucks, but this
is an example ). You have not only the problems listed above,
but now you need to worry about the error codes generated by
the system() call as well.
My general method ( I am contemplating shining this up and
attempting to present it at YAPC ) is I have a module that
provides a series of simple functions ( Debug, Warn, Error,
Usage, Notify ), a series of control variables ( don't
print the message, die when Error is called, etc. ) and a
whole slew of constants. The constants are defined in a hash,
and the import routine makes constant functions of them
and shoves them into the caller's space. There is one special
value - NO_ERROR which is numerically 0. I use this since
system() returns 0 on success ( it has some other interesting
side affects ). The defined error codes are negative and
any system() call returns the correctly left shifted response
on error.
This allows the calling functions to determine if the error
was internal ( eg, "Before I can do x, you need to give
me y and z" ) or external ( eg, "That command does not
exist" ). With a few more functions, you can even turn the
numeric code back into something a bit more human readable.
Returning useful error codes is difficult at best. This is
a long-winded statement of my current solution to this problem
in the hopes it may help you. Sorry for the lack of code.
If enough interest is given, I may clean the module up enough
for the Catacombs.
Mik
mikfire | [reply] |
|
I agree my method is not perfect, I'm just saying that I
try to use it as often as possible. It also makes me
write simple, atomic functions that do only one thing and that
return either a valid result or a single failure. And the
calling code most of the time dies in case of
failure.
I find it very rare that a function should fail but the
calling code will behave differently depending on the error
| [reply] |
Re: The art of error handling
by mirod (Canon) on Dec 19, 2000 at 21:47 UTC
|
I usually die when either the error
is obviously fatal or is not supposed to happen at all
(if it happens then there is a bug in the program).
Otherwise I return a false (0) or undef value. False
is nicer to test than undef but it is not
always possible as 0 might be a valid returned value.
If undef is also a valid returned value then I am
in trouble ;--) I try to code around it though, making it
an illegal result.
| [reply] |
|
Since you mentioned CGI (thus, CGI app writing), it is extremely important to return some error code and message to the user/browser (this is called "user-friendliness"). This is not so intuitive until you use CGI::Carp or better yet, you own personalized error message system which fits in with the theme of the rest of the site. The problem I see with global variables is that they become to cluttery and are named differently for every module. I find function return codes far more useful and clear. Plus, you can put the whole shebang in the same statement!
die "Oops!\n" unless evilfunction(); #or "if", if you prefer
I suppose the use of the global vars carried over from C when you were basically stuck returning static variables for multi-variant-byte length information. Of course, Perl does not have this restriction and thus would recommend using returns. My favorite scheme is 0 for success, some string or number on failure- preferably something I might be able to output to the user. This is definitely that something that has come with me from my UN*X C background (POSIX likes to make success 0 now).
AgentM Systems nor Nasca Enterprises nor
Bone::Easy nor Macperl is responsible for the
comments made by
AgentM. Remember, you can build any logical system with NOR.
| [reply] [d/l] |
|
Since you mentioned CGI (thus, CGI app writing), it is extremely important to return some error code and message to the user/browser (this is called "user-friendliness"). This is not so intuitive until you use CGI::Carp or better yet, you own personalized error message system which fits in with the theme of the rest of the site.
But don't be too friendly. Never tell anything about why an error happened. Just say "something broke, we already know about it", no matter what went wrong. Do not say "database error" or "invalid URL" or anything that reveals
any kind of nature about why.
Yes, it's human nature to want to know (and to reveal), but it's a evil person's nature to use that information to reveal potential breakin paths. So don't. Just say no to "helpful error messages".
-- Randal L. Schwartz, Perl hacker
| [reply] |
|
|
|
My personal opinion is to never, ever, ever die when using CGI.
The most annoying thing to me (as a web user) is when something doesn't work and I'm just presented with an error. That pisses me off more than anything else. If I get an irrecoverable error (very, very rare), I want to see an error page that gives me options for notification, ETA on fixes, or other things I can do with my data that I just spent 15 minutes filling out.
And just to digress for just a moment, 404 pages suck. If the page isn't found, it's not there, but just don't leave me hanging, give me options or redirect somewhere. I went once to a lecture on usability, and while the lecture sucked, the speaker made one good point; that is that as programmers, we spend far too much time just quitting out of apps when something goes wrong. More care needs to be taken in handling the error. Sending an error message back to the user doesn't actually "handle" the error. It just passes it on for the end-user to handle, often times with no options whatsoever.
| [reply] |
|
return($success_ref, $error_ref);
Each subroutine returns either
$dbh->prepare($query) || return (0, "Error: " . $dbh->errstr);
or
return ($array_ref, 0);
You can pepper these throughout your code anywhere there's a DBI action able to fail and it ensures that you can fail somewhat gracefully.
It's not perfect, but it also makes error-checking easy (just test whether $errors contains anything) and allows you to develop a cascading series of calls where it doesn't bother executing subsequent queries if one has already failed. | [reply] [d/l] [select] |
|
But if you return undef or 0, the error is unspecific, it just means there was a failure... how do you address the specifics? I have been struggling with error checking myself, so I am quite curious what responses will appear.
Update: By the way, the way that I have been handling errors in modules is to have a sub called error() which takes a string, stores it in an ERRORS array in the package, and return undef, so I can have code like this:
open(TEST, ">$file") or return $self->error("Failed to open $file");
That way, I get the undef return value, and store the error in the array in one shot. There's probably a few better ways, though.
Hot Pastrami | [reply] [d/l] |
|
If you use subroutine/methods which perform simple
operations then the calling code usually either
ignores the error, fall back to another method (
$val= $obj->method || default) or dies quickly.
So I usually handle errors at the level just above the
method/subroutine where it happens. Not quite the
"detect at the lowest level and at the highest level"
philosophy, I admit, but it usually works for me.
| [reply] |
Re: The art of error handling
by turnstep (Parson) on Dec 20, 2000 at 04:54 UTC
|
For the particular example CGI/DBI, I use a custom
error-handling subroutine that does all the work for me.
The only thing I have to pass (usually) is a simple
error code, and perhaps the offencding SQL. For example:
my $dbh = DBI->connect($foo,$bar,$baz,{AutoCommit=>1})
or &Error(1);
my $sth = $dbh->prepare($SQL_TEXT)
or &Error(2,$SQL_TEXT);
The idea is to pass as little as possible, and let the subroutine
deal with it. $DBI::errstr is already global, so it is not
passed. The handy dandy caller function tells us
exactly where this error came from. The error function then
can react different ways, depending on the situation. For most
users, it will display a "we are having probems with the
database" message, log the error internally somewhere, and
exit there (after some quick cleanup, e.g. $dbh->disconnect).
For users that are markd as a "developer", a different
error message is returned, this one usually has at least
$DBI::errstr, the file and line number where it occured,
and additional SQL used (formatted for the browser in pretty
colors, of course, etc.) One of the biggest advantages to
such an approach is the ease in which you can change things -
say you want users to see a different message - voila! change
it in one place, your error handling routine.
More specific errors, such as the wrong format in the
field, should be caught before they go into the database,
IMO. Javascript with a safety net of some simple perl
checks go a long way towards making this happen. Avoid
ever feeding a database "bad" data, as your program will
have to rely on the DB's error message to figure out what
is going on.
As far as making custom modules, I am a big fan of setting
globals. In the case above, it really really helps that
DBI sets $DBI::errstr and let's ME (e.g. the script) deal
with it how I see fit. Sometimes I use the information
inside it, sometimes I don't. I prefer methods that return
simply true or false, then set a global.
| [reply] [d/l] |
|
%in = (
title => undef,
msg => undef,
debug => 0,
html_action =>1,
log_action => 1,
death => 1,
@_
);
By default it returns a message in HTML to the user with the title and msg, but you can also turn on debugging output for programmers, and optionally log the error someplace if you want. It differs from your system in at least one significant way:
Your error handling routine apparently stores the full error messages someplace, so error(1) would expand into something meaningful. This has the drawback that another programmer (or even a forgetful self) would have to go look up the code to see what error was being thrown. This has the advantage of producing short code, and not repeating the same error message, but it's a bit more obfuscated. In my system, the title and message is required for every message, which makes it clear what's happening in the main code line, but it does appear more cluttered, and I end up copying and pasting common errors around. I was thinking of developing a comprise solution: I would just require the title field to be passed. If no msg attribute was present, it would look for default msg in an error hash, with the title as the key. I like this because the key is now a short phrase that will hopefully have some meaning to humans, and be more memorable than a number. However, it allows a succint error handling syntax, and the ability to reuse the text of common error messages.
-mark | [reply] [d/l] |
Re: The art of error handling
by Hot Pastrami (Monk) on Dec 19, 2000 at 22:19 UTC
|
On this subject, I have seen code where the developer used prototyped subs to mimic try/catch exception handling:
try {
open(TEST, ">$file");
};
catch {
push(@errors, "Failed to open $file");
return undef;
};
This basically eval()s the try block, and the catch checks the value of $@. Is this good style? I would think that the eval() overhead would be prohibitive, that is why I have avoided using it.
Hot Pastrami | [reply] [d/l] |
|
It's not particularly good style, IMO. The reader has to constantly remind themselves what your particular definition of try and catch do (in particular, what their side-effects are).
Such temptations exist in other languages as well ... have you ever had to maintain e.g. C code where the original programmer decided to #define their own aliases for standard C keywords, perhaps in imitation of some other language?
eval {} and if ($@) {} may not be quite as attractive, but it is immediately clear to anyone with appreciable perl experience what they do (including side-effects). Fewer bytes too. ^_^
| [reply] [d/l] [select] |
Re: The art of error handling
by clscott (Friar) on Dec 20, 2000 at 04:36 UTC
|
I think you're checking the user input at the wrong point if you're relying on
the DBI::errstr to tell you that user input is invalid.
From a security and usability stand point you should test your data long before
you try and dump it into the database. I think the security part stands on it's
own without further explanation.
The earlier you detect a problem the easier it tends to be to deal with it.
What if this program routinely took 10 or 15 minutes to get to the point where
the insert happens and DBI::errstr is populated? Your users will be upset.
Better options for detecting and reporting errors in web apps (in order of increasing ease of user interaction):
(I actually skipped one option, but it relies on your application using a
really good templating system and being one cgi with a lot of conditionally
generated html: It's process the page on the server and return it with the
offending fields highlighted and the error message above or below the field
explaining the problem. The top of the page should declare that there is a
problem and to pay attention to the indicators yadda yadda yadda.)
If you continue down the road you're on now are you really prepared to grep
DBI::errstr for every partial error string that could be tossed from within the
database?
--
Clayton
| [reply] [d/l] |
|
It's true that it's often more usable to have the interface
permit only valid input to be expressed*--rather than
accepting anything and emitting error messages later--but
remember that you need to leave the server-side verification
in place for security.
Clayton probably knew this, but client-side error
checking is risky on it's own since
everything running on the attacker's machine is under
his control, including his copy of your HTML and javascript.
* the
exceptions that prove the rule
| [reply] |
|
Hi Clayton,
Thanks for the response. Although my question was about about error handling in general, I'm happy to discuss this specific case some more. In general I agree that's better to check the user input early on. In this case, I actually considered using 3 drop down menus to force the user to select a valid date. I chose not to do this for two reasons: 1. It was going to take longer to code. :) 2. Based on my own experience, it's faster just to type in the date than to take my hands off the keyboard to do three mouse-menu selections. So I felt like it was more user friendly (Although I realize with my solution they have the possibility of dealing with an error page which won't need to happen in the other case.) I don't think there is a security issue in this specific case-- either someone is going to input a valid date or they are not, and if they don't, I catch it.
I still could have chosen to do the error checking in perl before it went in the database. Checking for YYYY-MM-DD is easy. If that was all there was to it, I probably would have done that. I also need to check to see if it's a valid date, which is not so easy. To do this I would have loaded Date::Calc at a slight speed penalty. So I thought my solution was good because it relatively quick to code and easy to maintain, it executed quickly and it was safe from a security standpoint.
I agree that the solution wouldn't "scale up in well" in the sense that I don't want to be greping DBI::errstr to check for user input problems all the time. I'm more interested in the question of "In this case, I have a standard error I want to return. I want it to be easy to cause the error to be thrown by default, but I want the error to be handled a high level. How do I do that?"
-mark
| [reply] |
Re: The art of error handling (clarified)
by markjugg (Curate) on Dec 19, 2000 at 22:36 UTC
|
Thanks for the responses everyone.
To clarify the situation I'm struggling with: The case I'm interested in is when I have a nice friendly error that I want to make available when there is a problem, but I don't want to force it on the user, even though it's probably the Right Thing to happen most of the time. It seems like populating a global variable or method would be one good way to do this, in conjunction with a system that could through the error by default (how DBI does things) but there are other good ways I'm sure.
-mark | [reply] |
|
|