PerlMonks  

What Are The Rules For Subs And Modules When Fed Wrong Input

by Cody Pendant (Prior)
on Jul 21, 2002 at 03:36 UTC ( [id://183747]=perlquestion )

Cody Pendant has asked for the wisdom of the Perl Monks concerning the following question:

I think the title says it all, but:
  1. If I've written a sub or module which expects, say, three variables to be passed to it, what should it do when it gets two?
  2. What if it gets four?
  3. What's the correct position on whether it should die, warn, or just return an error message?
  4. How detailed should the error message be? Should I say:
    1. Wrong number of parameters! I need three!
    2. Wrong number of parameters! Correct syntax is my_sub($name,$phone,$address)
    3. Wrong number of parameters! Correct syntax is my_sub($name,$phone,[$address|@address]) -- for instance, if one of the parameters can be an array or a scalar
    4. Wrong number of parameters! Correct syntax is my_sub($name,$phone,$address)
    5. Wrong number of parameters! See documentation!
Thanks.
--
($_='jjjuuusssttt annootthheer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;

Replies are listed 'Best First'.
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by dpuu (Chaplain) on Jul 21, 2002 at 03:51 UTC
    Obviously, it depends...

    The first thing to say is that, if the sub you are using is known at compile time (e.g. it's not an object method), then you should probably use prototypes: then you get the errors, not your users:

    sub foo ($$) { print "@_\n"; }
    foo(1,2,3);

    % perl foo.pl
    Too many arguments for main::foo at foo.pl line 2, near "3)"
    Execution of foo.pl aborted due to compilation errors.

    If you want to do run-time checking, then you should consider that it may well be end-users who see the errors, not the person writing the script. In this case, I'd use the following guidelines:
    • If possible, have default values for params not supplied -- then it's not an error
    • If you really want to die, then you should provide an error message that tells the user that it's not their fault:

      Internal Error: The script you are running has found an error made by its programmer, and regrets that it is unable to continue. Please email (the developer), and include the following information (... stack dump ...). Please accept our apologies.

Basically, detailed messages probably won't help the user, so don't confuse them. --Dave.
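
The first guideline above (defaults for parameters that were not supplied) can be sketched roughly as follows. The sub name `format_contact` and the default strings are assumptions made up for illustration, not anything from the original post.

```perl
use strict;
use warnings;

# Hypothetical sub: $name is required, the other two fall back to defaults.
sub format_contact {
    my ($name, $phone, $address) = @_;

    # A missing trailing argument arrives as undef, so fill it in here
    # instead of treating the short call as an error.
    $phone   = 'unlisted'   unless defined $phone;
    $address = 'no address' unless defined $address;

    return "$name, $phone, $address";
}

print format_contact('Cody'), "\n";
print format_contact('Cody', '555-0100', '1 Main St'), "\n";
```

With this approach a two-argument call is simply a call that accepts the defaults, so nothing needs to be reported to anyone.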
      sub foo ($$) { print "@_\n"; }
      foo(1,2,3);
      >>> Too many arguments for main::foo at ...

      Could you please explain that in a little more detail? I'd really appreciate it, as I don't know what's meant by "prototypes".

      But my concern is about what to do if I'm writing a module, not if I'm using a module. Perhaps the real answer is, write good and unambiguous documentation?
      --

      ($_='jjjuuusssttt annootthheer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;
        When you define a sub, you can tell the compiler what args it expects. In my example, the ($$) means that it must be passed two scalar values. Rather than me going into more detail myself, check out this tutorial. Another view is given by Tom Christiansen, here. Also, don't forget the Camel book. --Dave.
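
A small sketch of what the ($$) prototype does and does not check. The sub name `pair` is hypothetical. The string eval is used only so that the script survives the compile-time error and can print it; in ordinary code the error would stop compilation, as in the transcript above.

```perl
use strict;
use warnings;

sub pair ($$) { return "@_" }    # prototype: exactly two scalar arguments

print "prototype: ", prototype(\&pair), "\n";

# The argument count is checked when the call is compiled, so the bad call
# is compiled inside a string eval to capture the message.
my $ok            = eval q{ pair(1, 2, 3); 1 };
my $compile_error = $@;
print "rejected: $compile_error" unless $ok;

# Calling with a leading & bypasses the prototype entirely.
my $unchecked = &pair(1, 2, 3);
print "unchecked call returned: $unchecked\n";
```

Prototypes are also ignored for method calls, and ($$) forces scalar context on each argument, so they are a partial check rather than a general argument validator.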
      Is there a simple way to print the call stack with parameter values? I could use this a great deal.
        Check out Devel::DumpStack --Dave.
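
As a core-only alternative to the module suggested above, Carp's confess() dies with a full backtrace that includes the arguments of each call. This is a minimal sketch; `inner` and `outer` are hypothetical subs, and the exact quoting of arguments in the trace varies between Carp versions.

```perl
use strict;
use warnings;
use Carp qw(confess);

sub inner { confess 'unexpected state' }   # dies with a full backtrace
sub outer { inner(@_, 'extra') }

# Catch the exception so the trace can be inspected or logged.
my $ok    = eval { outer('alice', 42); 1 };
my $trace = $@;
print $trace unless $ok;
```

Carp also provides cluck(), which emits the same kind of backtrace as a warning instead of dying.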
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by perrin (Chancellor) on Jul 21, 2002 at 05:03 UTC
    You should die so that the person who wrote the incorrect code can find out about the error. Since it's a module, use the Carp module's croak() function instead of die, and that will improve the value of the error message. Also, be sure to document this behavior so that users of your module have an opportunity to catch the error with an eval block if they need to.
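
A minimal sketch of the croak-plus-eval pattern described above. The package `My::Contacts` and its `add_contact` function are hypothetical names used only for illustration. Because croak() is called from a different package than the caller, the message is reported at the caller's line rather than inside the module.

```perl
use strict;
use warnings;

{
    package My::Contacts;    # hypothetical module name
    use Carp qw(croak);

    sub add_contact {
        # Report the mistake from the caller's point of view.
        croak 'Usage: add_contact($name, $phone, $address)' unless @_ == 3;
        my ($name, $phone, $address) = @_;
        return "$name / $phone / $address";
    }
}

# A caller that wants to survive the error wraps the call in eval.
my $result = eval { My::Contacts::add_contact('Cody', '555-0100') };
my $err    = $@;
print defined $result ? "ok: $result\n" : "caught: $err";
```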
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by dws (Chancellor) on Jul 21, 2002 at 04:15 UTC
    If I've written a sub or module which expects, say, three variables to be passed to it, what should it do when it gets two?

    Consider CGI.pm as an exemplar. Return undef if an argument is missing and you can't substitute a reasonable default. die as a last resort.

Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by rob_au (Abbot) on Jul 21, 2002 at 09:41 UTC
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by flocto (Pilgrim) on Jul 21, 2002 at 09:18 UTC

    When writing objects I usually set a global variable $ERROR upon failure and return undef. $ERROR is then made available via a method. This way I can have the script decide if it can live with that. Example code in the script could be the following, which is what I normally do.

    my $mod = Module->new () or die Module->error ();
    my $foo = $mod->method () or warn $mod->error ();

    The question remains: are more arguments than expected an error? Well, I usually ignore unnecessary arguments, but this too depends on the context. Another approach I use (especially in constructor methods) is passing hashes and then checking for existing keys. So in general I'd say it depends on the context, but I wouldn't let an object kill the entire program.

    Regards,
    -octo
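
The module side of the approach described in this post might look roughly like the sketch below. The package name `My::Module` and the required key `name` are assumptions chosen for illustration; the post itself only shows the calling code.

```perl
use strict;
use warnings;

{
    package My::Module;    # hypothetical module name
    our $ERROR = '';       # last error message, exposed through error()

    sub new {
        my ($class, %args) = @_;

        # Hash-style arguments: check for the keys that must exist.
        unless (exists $args{name}) {
            $ERROR = 'new(): required key "name" is missing';
            return;        # undef in scalar context
        }

        my $self = {%args};
        return bless $self, $class;
    }

    sub error { return $ERROR }
}

# The calling script decides whether the failure is fatal.
my $obj = My::Module->new(colour => 'red');
print "failed: ", My::Module->error, "\n" unless $obj;
```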

      I would say that the idea of an $ERROR variable is good in concept (similar to the perl variable $!). However, if this is in a module then I think abstraction should be the key.

      This way I can have the script decide ...

      It's important, I agree, to let the script decide; however, the script shouldn't have to handle the code to decide.

      For example, there could be a warn method in your module (or error, etc.), that accepts a variable and then chooses the appropriate level to warn the calling program (or to die() itself).

      One benefit to this approach is that you can easily have the warn level in your program localized -- and contrary to handling it all yourself, changing what it is would be as simple as changing a parameter.

      This, of course, assumes that the module takes an OOP approach (which I'm biased towards myself).

      I agree with the above error-handling technique. I hate to have a module die on me, ever. I also hate wrapping every call to a method in eval { } to ensure that it doesn't die, and I want my logic to be able to know when there's been an error, not my user. The only points I would add would be to use an error method rather than a variable, which gives the author / user some control (if they wish) over how the error is reported, and to make all methods return undef() on failure so that the person using the module has a consistent way of determining whether an error occurred. Any method that doesn't return data should return true (1), so that the user could say:

      if($object->method) { .. } else { my $error = $object->error(); }

      e.g.:

      sub foo {
          my($self,$arg) = @_;
          if( !defined($arg) ) {
              $self->error("[mymodule::foo] Required Argument, ARG, not supplied.\n");
              return(undef);
          } else {
              return(1);
          }
      }

      As a note for configurable error-handling methods, you could do something like this:

      sub error {
          my $self = shift;
          if( defined($_[0]) ) {
              $self->{'Error'} = $_[0];
              if( $self->raise_error() ) {
                  warn("$_[0]\n");
              }
              if( $self->die_error() ) {
                  die("$_[0]\n");
              }
          }
          return($self->{'Error'});
      }

      So there, you have a method that can be used both as a private method (for setting the error) and a public method (retrieving the error), with the ability for the user to configure how they get their errors back: as a scalar, a call to warn, or a call to die.

      Now, you'd also want to give them a method for configuring that activity, you could either do it via options when creating a new instance of the module, or you could give them method(s) like the following, which also handle the lookups for the error() method:

      sub raise_error {
          my $self = shift;
          $self->{'Raise'} = $_[0] if( defined($_[0]) );
          return($self->{'Raise'});
      }

      There, now you've got these methods that have both public and private interfaces, that effectively handle whatever the user desires, but by default (you could set them to) require checks on return values and lookups with a method...


      !c
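
Putting the pieces from this reply together, one possible self-contained version is sketched below. The class name `My::Widget`, the constructor option names, and the die_error() accessor (which the reply calls but does not show) are assumptions for illustration.

```perl
use strict;
use warnings;

{
    package My::Widget;    # hypothetical class

    sub new {
        my ($class, %opt) = @_;
        my $self = {
            Error => undef,
            Raise => $opt{raise_error},    # warn on error if true
            Die   => $opt{die_error},      # die on error if true
        };
        return bless $self, $class;
    }

    sub raise_error {
        my $self = shift;
        $self->{Raise} = $_[0] if defined $_[0];
        return $self->{Raise};
    }

    sub die_error {
        my $self = shift;
        $self->{Die} = $_[0] if defined $_[0];
        return $self->{Die};
    }

    # Setter when given a message, getter otherwise.
    sub error {
        my $self = shift;
        if (defined $_[0]) {
            $self->{Error} = $_[0];
            warn "$_[0]\n" if $self->raise_error;
            die  "$_[0]\n" if $self->die_error;
        }
        return $self->{Error};
    }

    sub foo {
        my ($self, $arg) = @_;
        unless (defined $arg) {
            $self->error('[My::Widget::foo] required argument ARG not supplied');
            return;
        }
        return 1;
    }
}

# Default behaviour: check the return value, then ask for the error.
my $quiet = My::Widget->new;
print "quiet object reported: ", $quiet->error, "\n" unless $quiet->foo;

# Opt-in behaviour: the same mistake becomes an exception.
my $strict    = My::Widget->new(die_error => 1);
my $survived  = eval { $strict->foo; 1 };
my $died_with = $@;
print "strict object died: $died_with" unless $survived;
```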
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by fsn (Friar) on Jul 21, 2002 at 11:06 UTC
    I have always felt that the compiler/interpreter should give as much information as possible when it finds a syntax error. Nothing is more irritating than just getting a "Syntax error somewhere in your script". Therefore I think 4a and 4e are bad, because they are equivalent to a rude RTFM. A PerlMonk might, and in some cases should, say RTFM when a too-simple question is asked, because answering the same thing over and over again is tiring. A compiler/interpreter/syntax checker never gets tired. In my opinion, 4c and 4b are excellent.
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by dsheroh (Monsignor) on Jul 21, 2002 at 22:12 UTC
    Return an error value. If it's really, really important, write a message to STDERR or a log file (not STDOUT!). Library code should never, never, NEVER die on any error condition short of the CPU bursting into flames (and even then it's questionable). What you consider to be a fatal error may be a normal condition for the programmer using your library/module. (Yes, I've had the supreme displeasure of using a library which "helpfully" terminated my programs whenever an error occurred without giving me a chance to detect and correct the situation, or at least perform a controlled shutdown.)

    Oh, and another thing re: the examples under question 4 - the user should never be subjected to detailed information about the programmer's mistakes. It will just confuse them. (*ring* *ring* "Hello, tech support." "Your program just said, 'Wrong number of parameters! Correct syntax is my_sub($name,$phone,$address)', but I filled in all the blanks on the form. And WTF does 'my_sub' mean?")

      I disagree. Die. Always die. Die on everything except success.

      The reasoning behind this is simple. It's easy to handle an exception: you can catch it, you can ignore it, you can re-throw it. All of this without cluttering the interface with random error variables and having to test returns at every single operation.

      Exceptions didn't catch on for no reason, they are an elegant and effective method of error handling and indeed I find myself having a very hard time writing software in languages that don't support them.

      Take a database error, for example. If the DBI module threw useful exceptions when things went wrong, I would no longer have to sit there checking the return on every little operation to make sure things went ok. I could wrap my entire atomic operation in an eval {}, catch any exceptions I found relevant and re-start the entire operation or provide *useful* input to the user if necessary. Easy, clean, effective.

      Think twice before dissing die(), it's a big strength. That said, no interface should die() without the documentation stating what exceptions are thrown and when :)
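
The catch, handle, or re-throw pattern argued for above can be sketched as follows. The step subs and the "timeout:" message prefix are hypothetical; they stand in for whatever operations and documented exceptions a real module would have.

```perl
use strict;
use warnings;

# Hypothetical steps of one "atomic" operation; the second one fails.
sub step_one { return 1 }
sub step_two { die "timeout: server busy\n" }

sub run_operation {
    # One eval around the whole operation instead of a check per step.
    my $ok = eval {
        step_one();
        step_two();
        1;
    };
    return 'done' if $ok;

    my $err = $@;
    return 'retry later' if $err =~ /^timeout:/;    # an error we can handle
    die $err;                                        # anything else: re-throw
}

print run_operation(), "\n";
```

Saving `$@` into a lexical right after the eval avoids losing it if later code runs another eval.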

        If the DBI module threw useful exceptions when things went wrong, I would no longer have to sit there checking the return on every little operation to make sure things went ok. I could wrap my entire atomic operation in an eval {}, catch any exceptions I found relevant and re-start the entire operation or provide *useful* input to the user if necessary.
        Strangely enough you can do that. It's just a matter of reading through the documentation (1.30 here, 1.19 here).

        You can use the RaiseError, PrintError and (on later DBI versions) HandleError database handle attributes to pull this off. If you download Tim Bunce's Advanced Perl DBI slides there is more information.

        Hope this helps...

        gav^

Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by tame1 (Pilgrim) on Jul 22, 2002 at 04:08 UTC
    Cody,

    One of the things I use to get around things like this is to use anonymous hashes to call subroutines. For example:
    my ($value,$error) = my_sub(input => 'who', inny => 'what');
    unless ($value) {
        my_abort ("There was an error: ",$error);
    }
    and then, in my_sub
    sub my_sub {
        my %args = {
            input => undef,
            inny => undef,
            @_
        };
        return(undef,"Missing inny") unless $args->{'inny'};
        # do some stuff
        if ($problem) {
            return(undef,$error_msg);
        }
        return($value,undef);
    }
    As you can see, with this you can test for each input requirement by name, dying on the "gotta-have-its" and just warning on the "ok-I-dont-need-its".
    The my_abort could even be changed to check for error severity and do a warn or die based on that.

    Anyhow, just my 2 cents on handling sub calls that may return errors.

    What does this little button do . .<Click>; "USER HAS SIGNED OFF FOR THE DAY"
      There's a small typo in the code example:
      sub my_sub {
          # my %args = {    # Was...
          my $args = {      # Should be...
              input => undef,
              inny => undef,
              @_
          };
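
With that correction applied, a runnable version of the named-argument approach might look like this. The body of the sub (building a string from the two arguments) and the 'nobody' default are placeholders invented for illustration, since the original only says "do some stuff".

```perl
use strict;
use warnings;

sub my_sub {
    # Defaults first, caller-supplied pairs last so they override.
    my $args = {
        input => undef,
        inny  => undef,
        @_,
    };

    # "inny" is required; report its absence through the second return value.
    return (undef, 'Missing inny') unless defined $args->{inny};

    # "input" is optional; fall back to a placeholder default.
    my $input = defined $args->{input} ? $args->{input} : 'nobody';

    return ("$args->{inny} for $input", undef);
}

my ($value, $error) = my_sub(input => 'who');
print "There was an error: $error\n" unless defined $value;

my ($good) = my_sub(inny => 'what');
print "Result: $good\n";
```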
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by ichimunki (Priest) on Jul 22, 2002 at 18:14 UTC
    For your private code you should use die. To quote the Pragmatic Programmers, "Crash Early: A dead program normally does a lot less damage than a crippled one." If you are writing OO code, consider using Carp.pm and croak() or confess() so that the programmer gets a better idea of what went wrong (stack trace, etc).

    For your public code you should also die. Although in some cases you may want to wrap that in some sort of eval to provide services to your users similar to the way CGI can be asked to print errors to the browser and DBI will not crash a program because of a SQL error (although maybe it should).

    If someone does not want their program to die as a result of misusing an API, they can wrap their code in an eval.

    Yes, you could return a false or some other error string, but then I have to write each call to your module as $foo->bar() or something; or as $foo->bar(); if $foo->err { something; }. In any case it's going to take just as long to track errors whether you die or try to handle the failure. But why make work for the script(er)? If the programmer absolutely cannot handle a failure they can catch it using $SIG{__DIE__}, I believe (having never felt the need to try this myself, I rely only on what I read at 'perldoc perlvar').
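
A hedged note on the last point: a $SIG{__DIE__} handler is a hook that runs when die is called, but it does not by itself stop the exception from propagating; eval is what actually catches it. A small sketch, with a hypothetical failing sub `risky`:

```perl
use strict;
use warnings;

my @seen;    # messages observed by the hook

sub risky { die "bad input\n" }    # hypothetical failing API call

my $ok = do {
    # The hook is notified of the die but does not prevent it.
    local $SIG{__DIE__} = sub { push @seen, $_[0] };
    eval { risky(); 1 };           # eval is what catches the exception
};
my $err = $@;

print "hook saw: $seen[0]";
print "eval caught: $err" unless $ok;
```

So $SIG{__DIE__} is better suited to logging or decorating errors, while eval remains the way to recover from them.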

Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by Cody Pendant (Prior) on Jul 21, 2002 at 03:38 UTC
    Please ignore 4 (d). Cut-and-pasto.
    --
    ($_='jjjuuusssttt annootthheer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by Cody Pendant (Prior) on Jul 22, 2002 at 04:49 UTC
    Thank you all for your input, contradictory as it sometimes was. It's all pretty academic for me -- for the time being, as I won't be submitting modules to CPAN any time soon -- but it's great to get the input of the community.
    --
    ($_='jjjuuusssttt annootthheer pppeeerrrlll haaaccckkeer')=~y/a-z//s;print;
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by kschwab (Vicar) on Jul 22, 2002 at 17:35 UTC
    I asked a similar question in a node called Exceptions and Return Codes.

    I didn't seem to get a consensus...but there's lots of quality replies there.

Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by Necos (Friar) on Jul 23, 2002 at 21:40 UTC
    I believe, like a few others, that the error handling should depend on _WHAT_ the function is doing. For example, if you are trying to do a system call that fails, you should die (or at least set $! to the error). More importantly, you should document your module/sub to explain what the error handling procedures are. Here's how I feel error handling should be "handled":

    1.) In the documentation, be very clear about what must be included, and which parameters are optional. To take an example, in Tk, you don't need to set the -background and -foreground properties of a window/frame/etc. It automagically falls on a default value. Let your users know this ahead of time. If you are handed more arguments than you need, then say that extra parameters will be discarded. There are basically two ways to handle this:
    my $param1 = shift; my $param2 = shift; my $param3 = shift;
    or
    my ($param1, $param2, $param3) = @_;
    I personally like the first, even though it is expensive. Pick whichever method you like and stick with it.

    2.) I usually die when I have a fatal system error (as mentioned above), warn when I have a not-so-fatal error (say a log file is not available, but everything went fine), and return when there is something valuable to return. In a module I'm writing for my job, I do something like this:

    a.) If I can not open the data source file (DB, text file, etc.), I die. There is no point to go any further as it will propagate more errors later on. Not only that, but they will be much harder to understand.

    b.) If I can not open a log file, but everything is working, I warn. The code was written such that everything will be known to be going fine before the log is written. If anything has gone wrong before the logging code is reached, a die will occur. Again, no use propagating errors that are fatal.

    c.) If there are no values (say, from a search) to return, I return an empty string ''. That way, someone can just check the length of the return value (or ref) to see if they got any useful data and do something with it.

    3.) Error messages should be detailed enough that the user of the code knows what the error was and why it occurred. Again, good documentation can save you a lot of thinking about how detailed error information should be inside of the module. You could simply say:

    "This function will return 0 on failure." or "This function will die if a system error occurs."

    It all depends on how fatal you think the error is. Will it propagate more confusing errors? Then die. Do you want to just continue? Then warn. If you want to just return an error message or numerical status, then return.

    Documentation is your best friend. If you use it well, then your error handling becomes trivial (or close to it).
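
The die / warn / return split described in point 2 above could be sketched like this. The sub `search_file`, its arguments, and the log format are assumptions for illustration, not the module mentioned in the post.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical search routine following the policy in point 2:
#   die       when the data source cannot be opened,
#   warn      when only the log cannot be written,
#   return '' when there is simply nothing to report.
sub search_file {
    my ($data_path, $log_path, $pattern) = @_;

    open my $in, '<', $data_path
        or die "Cannot open data file '$data_path': $!\n";
    my @hits = grep { /\Q$pattern\E/ } <$in>;
    close $in;

    if (open my $log, '>>', $log_path) {
        print {$log} scalar(@hits), " hit(s) for '$pattern'\n";
        close $log;
    }
    else {
        warn "Cannot write log '$log_path': $!\n";
    }

    return @hits ? join('', @hits) : '';
}

# Small demonstration with a temporary data file.
my ($fh, $data) = tempfile(UNLINK => 1);
print {$fh} "alpha\nbeta\n";
close $fh;

my $found = search_file($data, "$data.log", 'beta');
print length($found) ? "found: $found" : "nothing found\n";
unlink "$data.log";
```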

    Theodore Charles III
    Network Administrator
    Los Angeles Senior High
    4650 W. Olympic Blvd.
    Los Angeles, CA 90019
    323-937-3210 ext. 224
    email->secon_kun@hotmail.com
    perl -e "map{print++$_}split//,Mdbnr;"
Re: What Are The Rules For Subs And Modules When Fed Wrong Input
by Stegalex (Chaplain) on Jul 22, 2002 at 01:10 UTC
    Is the NodeReaper off tonight? Just wondering. ++ to everyone!

    ~~~~~~~~~~~~~~~
    I like chicken.
      It's a valid Perl question, albeit one that has been covered a couple of times... Even relatively experienced programmers (like me) like to be reminded of the many different ways Perl can handle things... :)

      ----
      Zak
      "There is no room in this country for hyphenated Americanism" ~ Theodore Roosevelt (1915)

Node Type: perlquestion [id://183747]
Approved by ignatz
Front-paged by RhetTbull