Beefy Boxes and Bandwidth Generously Provided by pair Networks
Come for the quick hacks, stay for the epiphanies.
 
PerlMonks  

The art of error handling

by markjugg (Curate)
on Dec 19, 2000 at 21:36 UTC ( [id://47426]=perlquestion: print w/replies, xml ) Need Help??

markjugg has asked for the wisdom of the Perl Monks concerning the following question:

Now that I'm becoming a wiser and more experienced perl hacker, I'm learning that good error handling systems are a bit of an art form. The book "Practice of Programming" recommends to do something like "detect errors at a low level, deal with them on a high level". This is a philosophy I can appreciate. To illustrate my discussion, here's a error handling subroutine that could be used by someone checking the return values from a Postgres insert from a CGI form:
sub is_pg_date_error { my $key = shift; my $value = param($key); if ($DBI::errstr =~ m/Bad date external representation '$value'/o) + { error(title=>'Inproper date format', msg=>"There is a problem with $key field. Please return an +d format it in the format YYYY-MM-DD and confirm that it is a valid da +te.") } }
This is probably bad style, because it detects the error it a low level (where this routine is hidden away), but it doesn't allow the error to handled at a high level-- it throws an error itself immediately. The issue here is that most of the time I do want to throw this error. So I want to be able to handle the error at a high level, but for it happen automatically if I want to.

Looking at how CGI.pm and DBI.pm deal with errors, they both have similiar approaches which are inline with the "Practice of Programming" principles-- when there is an error, they populate some global routines in their package with the relevent errors. You can choose to do something with these errors or just ignore them.

So, when developing custom modules, would recommend the same error handling scheme in general? That is, when there is an error, instead of dealing with directly or returning it to user, stick it into a package global which the user can manipulate on their own? If not that, what other insights on the art of error handling do you have? Thanks!

-mark

Replies are listed 'Best First'.
Re: The art of error handling
by mikfire (Deacon) on Dec 19, 2000 at 22:06 UTC
    The short fall of mirod's scheme is that is usually only one right answer, but there infinite wrong answers. For example, if I write some code that opens a file for reading, many things can go wrong. The file might not exist, the file might not be readable by the current user, the file might be empty, etc. If you are trying to give meaningful errors to the calling routines, you have exhausted the 0 or undef on failure option. All the calling code can do is say "Yup, something bad happened".

    This gets even more complex when the low level routines are interacting with system(). To continue the example above, lets say the subroutine opens a file, reads a command from it and executes it via system() ( yes, security sucks, but this is an example ). You have not only the problems listed above, but now you need to worry about the error codes generated by the system() call as well.

    My general method ( I am contemplating shining this up and attempting to present it at YAPC ) is I have a module that provides a series of simple functions ( Debug, Warn, Error, Usage, Notify ), a series of control variables ( don't print the message, die when Error is called, etc. ) and a whole slew of constants. The constants are defined in a hash, and the import routine makes constant functions of them and shoves them into the caller's space. There is one special value - NO_ERROR which is numerically 0. I use this since system() returns 0 on success ( it has some other interesting side affects ). The defined error codes are negative and any system() call returns the correctly left shifted response on error.

    This allows the calling functions to determine if the error was internal ( eg, "Before I can do x, you need to give me y and z" ) or external ( eg, "That command does not exist" ). With a few more functions, you can even turn the numeric code back into something a bit more human readable.

    Returning useful error codes is difficult at best. This is a long-winded statement of my current solution to this problem in the hopes it may help you. Sorry for the lack of code. If enough interest is given, I may clean the module up enough for the Catacombs.

    Mik
    mikfire

      I agree my method is not perfect, I'm just saying that I try to use it as often as possible. It also makes me write simple, atomic functions that do only one thing and that return either a valid result or a single failure. And the calling code most of the time dies in case of failure.

      I find it very rare that a function should fail but the calling code will behave differently depending on the error

Re: The art of error handling
by mirod (Canon) on Dec 19, 2000 at 21:47 UTC

    I usually die when either the error is obviously fatal or is not supposed to happen at all (if it happens then there is a bug in the program).

    Otherwise I return a false (0) or undef value. False is nicer to test than undef but it is not always possible as 0 might be a valid returned value.

    If undef is also a valid returned value then I am in trouble ;--) I try to code around it though, making it an illegal result.

      Since you mentioned CGI (thus, CGI app writing), it is extremely important to return some error code and message to the user/browser (this is called "user-friendliness"). This is not so intuitive until you use CGI::Carp or better yet, you own personalized error message system which fits in with the theme of the rest of the site. The problem I see with global variables is that they become to cluttery and are named differently for every module. I find function return codes far more useful and clear. Plus, you can put the whole shebang in the same statement!
      die "Oops!\n" unless evilfunction(); #or "if", if you prefer
      I suppose the use of the global vars carried over from C when you were basically stuck returning static variables for multi-variant-byte length information. Of course, Perl does not have this restriction and thus would recommend using returns. My favorite scheme is 0 for success, some string or number on failure- preferably something I might be able to output to the user. This is definitely that something that has come with me from my UN*X C background (POSIX likes to make success 0 now).
      AgentM Systems nor Nasca Enterprises nor Bone::Easy nor Macperl is responsible for the comments made by AgentM. Remember, you can build any logical system with NOR.
        Since you mentioned CGI (thus, CGI app writing), it is extremely important to return some error code and message to the user/browser (this is called "user-friendliness"). This is not so intuitive until you use CGI::Carp or better yet, you own personalized error message system which fits in with the theme of the rest of the site.
        But don't be too friendly. Never tell anything about why an error happened. Just say "something broke, we already know about it", no matter what went wrong. Do not say "database error" or "invalid URL" or anything that reveals any kind of nature about why.

        Yes, it's human nature to want to know (and to reveal), but it's a evil person's nature to use that information to reveal potential breakin paths. So don't. Just say no to "helpful error messages".

        -- Randal L. Schwartz, Perl hacker

        My personal opinion is to never, ever, ever die when using CGI.

        The most annoying thing to me (as a web user) is when something doesn't work and I'm just presented with an error. That pisses me off more than anything else. If I get an irrecoverable error (very, very rare), I want to see an error page that gives me options for notification, ETA on fixes, or other things I can do with my data that I just spent 15 minutes filling out.

        And just to digress for just a moment, 404 pages suck. If the page isn't found, it's not there, but just don't leave me hanging, give me options or redirect somewhere. I went once to a lecture on usability, and while the lecture sucked, the speaker made one good point; that is that as programmers, we spend far too much time just quitting out of apps when something goes wrong. More care needs to be taken in handling the error. Sending an error message back to the user doesn't actually "handle" the error. It just passes it on for the end-user to handle, often times with no options whatsoever.

        The system I have tried to use across the Web site I control, and which I have found particularly useful for DBI work, is:

        return($success_ref, $error_ref);

        Each subroutine returns either

        $dbh->prepare($query) || return (0, "Error: " . $dbh->errstr);

        or

        return ($array_ref, 0);

        You can pepper these throughout your code anywhere there's a DBI action able to fail and it ensures that you can fail somewhat gracefully.

        It's not perfect, but it also makes error-checking easy (just test whether $errors contains anything) and allows you to develop a cascading series of calls where it doesn't bother executing subsequent queries if one has already failed.

      But if you return undef or 0, the error is unspecific, it just means there was a failure... how do you address the specifics? I have been struggling with error checking myself, so I am quite curious what responses will appear.

      Update: By the way, the way that I have been handling errors in modules is to have a sub called error() which takes a string, stores it in an ERRORS array in the package, and return undef, so I can have code like this:
      open(TEST, ">$file") or return $self->error("Failed to open $file");
      That way, I get the undef return value, and store the error in the array in one shot. There's probably a few better ways, though.

      Hot Pastrami

        If you use subroutine/methods which perform simple operations then the calling code usually either ignores the error, fall back to another method ( $val= $obj->method || default) or dies quickly.

        So I usually handle errors at the level just above the method/subroutine where it happens. Not quite the "detect at the lowest level and at the highest level" philosophy, I admit, but it usually works for me.

Re: The art of error handling
by turnstep (Parson) on Dec 20, 2000 at 04:54 UTC
    For the particular example CGI/DBI, I use a custom error-handling subroutine that does all the work for me. The only thing I have to pass (usually) is a simple error code, and perhaps the offencding SQL. For example:
    my $dbh = DBI->connect($foo,$bar,$baz,{AutoCommit=>1}) or &Error(1); my $sth = $dbh->prepare($SQL_TEXT) or &Error(2,$SQL_TEXT);
    The idea is to pass as little as possible, and let the subroutine deal with it. $DBI::errstr is already global, so it is not passed. The handy dandy caller function tells us exactly where this error came from. The error function then can react different ways, depending on the situation. For most users, it will display a "we are having probems with the database" message, log the error internally somewhere, and exit there (after some quick cleanup, e.g. $dbh->disconnect). For users that are markd as a "developer", a different error message is returned, this one usually has at least $DBI::errstr, the file and line number where it occured, and additional SQL used (formatted for the browser in pretty colors, of course, etc.) One of the biggest advantages to such an approach is the ease in which you can change things - say you want users to see a different message - voila! change it in one place, your error handling routine.

    More specific errors, such as the wrong format in the field, should be caught before they go into the database, IMO. Javascript with a safety net of some simple perl checks go a long way towards making this happen. Avoid ever feeding a database "bad" data, as your program will have to rely on the DB's error message to figure out what is going on.

    As far as making custom modules, I am a big fan of setting globals. In the case above, it really really helps that DBI sets $DBI::errstr and let's ME (e.g. the script) deal with it how I see fit. Sometimes I use the information inside it, sometimes I don't. I prefer methods that return simply true or false, then set a global.

      Thanks for the note turnstep,

      I use an error handling system that's very similiar to what you describe. In has input that looks like this:

      %in = ( title => undef, msg => undef, debug => 0, html_action =>1, log_action => 1, death => 1, @_ );
      By default it returns a message in HTML to the user with the title and msg, but you can also turn on debugging output for programmers, and optionally log the error someplace if you want. It differs from your system in at least one significant way:

      Your error handling routine apparently stores the full error messages someplace, so error(1) would expand into something meaningful. This has the drawback that another programmer (or even a forgetful self) would have to go look up the code to see what error was being thrown. This has the advantage of producing short code, and not repeating the same error message, but it's a bit more obfuscated. In my system, the title and message is required for every message, which makes it clear what's happening in the main code line, but it does appear more cluttered, and I end up copying and pasting common errors around. I was thinking of developing a comprise solution: I would just require the title field to be passed. If no msg attribute was present, it would look for default msg in an error hash, with the title as the key. I like this because the key is now a short phrase that will hopefully have some meaning to humans, and be more memorable than a number. However, it allows a succint error handling syntax, and the ability to reuse the text of common error messages.

      -mark

Re: The art of error handling
by Hot Pastrami (Monk) on Dec 19, 2000 at 22:19 UTC
    On this subject, I have seen code where the developer used prototyped subs to mimic try/catch exception handling:
    try { open(TEST, ">$file"); }; catch { push(@errors, "Failed to open $file"); return undef; };
    This basically eval()s the try block, and the catch checks the value of $@. Is this good style? I would think that the eval() overhead would be prohibitive, that is why I have avoided using it.

    Hot Pastrami

      It's not particularly good style, IMO. The reader has to constantly remind themselves what your particular definition of try and catch do (in particular, what their side-effects are).

      Such temptations exist in other languages as well ... have you ever had to maintain e.g. C code where the original programmer decided to #define their own aliases for standard C keywords, perhaps in imitation of some other language?

      eval {} and if ($@) {} may not be quite as attractive, but it is immediately clear to anyone with appreciable perl experience what they do (including side-effects). Fewer bytes too. ^_^

Re: The art of error handling
by clscott (Friar) on Dec 20, 2000 at 04:36 UTC

    I think you're checking the user input at the wrong point if you're relying on the DBI::errstr to tell you that user input is invalid.

    From a security and usability stand point you should test your data long before you try and dump it into the database. I think the security part stands on it's own without further explanation.

    The earlier you detect a problem the easier it tends to be to deal with it. What if this program routinely took 10 or 15 minutes to get to the point where the insert happens and DBI::errstr is populated? Your users will be upset.

    Better options for detecting and reporting errors in web apps (in order of increasing ease of user interaction):
    • Check the format with a Javascript and pop up an alert box with the error message "Format for the DATE must be YYYY-MM-DD" Then use the javascript to set the focus to the appropriate form field (provided the browser supports it and is turned on yada yada)
    • Build your web form so that it is impossible for the user to enter the incorrect date. This is easier on you as a programmer and easier on your users. Use 3 select boxes: one for the year, one for the month and one for the day.
         Year     Month       Day
         [2000|v] [January|v] [26|v]
         
      (bad ascii rendering as form and table tags are filtered) You get back three values that you know are acceptable and can be concatenated into a good looking date.
      $date = $param{year} . "-" . $param{month} . "-" . $param{day};
      Now that format is covered (and users aren't frustrated) use a module like Date::Manip to check if a date is _valid_ (i.e. not 2000-02-31 aka Feb 31st)
    (I actually skipped one option, but it relies on your application using a really good templating system and being one cgi with a lot of conditionally generated html: It's process the page on the server and return it with the offending fields highlighted and the error message above or below the field explaining the problem. The top of the page should declare that there is a problem and to pay attention to the indicators yadda yadda yadda.)

    If you continue down the road you're on now are you really prepared to grep DBI::errstr for every partial error string that could be tossed from within the database?

    --
    Clayton
      It's true that it's often more usable to have the interface permit only valid input to be expressed*--rather than accepting anything and emitting error messages later--but remember that you need to leave the server-side verification in place for security.

      Clayton probably knew this, but client-side error checking is risky on it's own since everything running on the attacker's machine is under his control, including his copy of your HTML and javascript.

      * the exceptions that prove the rule

      Hi Clayton, Thanks for the response. Although my question was about about error handling in general, I'm happy to discuss this specific case some more. In general I agree that's better to check the user input early on. In this case, I actually considered using 3 drop down menus to force the user to select a valid date. I chose not to do this for two reasons: 1. It was going to take longer to code. :) 2. Based on my own experience, it's faster just to type in the date than to take my hands off the keyboard to do three mouse-menu selections. So I felt like it was more user friendly (Although I realize with my solution they have the possibility of dealing with an error page which won't need to happen in the other case.) I don't think there is a security issue in this specific case-- either someone is going to input a valid date or they are not, and if they don't, I catch it.

      I still could have chosen to do the error checking in perl before it went in the database. Checking for YYYY-MM-DD is easy. If that was all there was to it, I probably would have done that. I also need to check to see if it's a valid date, which is not so easy. To do this I would have loaded Date::Calc at a slight speed penalty. So I thought my solution was good because it relatively quick to code and easy to maintain, it executed quickly and it was safe from a security standpoint.

      I agree that the solution wouldn't "scale up in well" in the sense that I don't want to be greping DBI::errstr to check for user input problems all the time. I'm more interested in the question of "In this case, I have a standard error I want to return. I want it to be easy to cause the error to be thrown by default, but I want the error to be handled a high level. How do I do that?"

      -mark

Re: The art of error handling (clarified)
by markjugg (Curate) on Dec 19, 2000 at 22:36 UTC
    Thanks for the responses everyone.

    To clarify the situation I'm struggling with: The case I'm interested in is when I have a nice friendly error that I want to make available when there is a problem, but I don't want to force it on the user, even though it's probably the Right Thing to happen most of the time. It seems like populating a global variable or method would be one good way to do this, in conjunction with a system that could through the error by default (how DBI does things) but there are other good ways I'm sure.

    -mark

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://47426]
Approved by root
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (7)
As of 2024-04-24 08:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found