in reply to Is validation of all subroutine arguments overkill?

"Though I'm sure it makes my code is more robust, I'm wondering at what cost."

The cost is time and extra effort, but the payoff is easier to troubleshoot, more robust and easier to update code.

Not only that, but it's experience for later projects; you get a very good understanding and knowledge of when you need to do deep validation, and when only a light skim will do. Not only that, but with good validation, your test suite can be (sometimes slightly) simpler... less edge cases to test against. You can for() loop a boatload of bad params at a sub in a test, and get back the same error/warning etc. if you've validated/checked them in the actual code. Get familiar with dies_ok() and throws_ok() from Test::Exception, and tricks to catch warnings, eg:

{ my $warning; local $SIG{__WARN__} = sub { $warning = shift; }; Module::do_something($param, $param1); like ($warning, qr/ignoring/, 'param 1 isn't an href, ignored'); }

Keep up the good work. The extra effort put in up front will pay off later when writing new projects, and when you come back to existing ones to make updates/fix bugs etc.

In my larger distributions, I typically use a named param public interface:

sub blah { my %params = @_; ... }

...called like:

my $farm = blah(moo => $href, meow => "Phoebe's smelly cat");

...along with some sort of _validate_params() function/method, and sometimes a public function so a user can fetch all the valid params if desired. Here's a really basic example.

sub valid_params { my %valid = ( moo => 1, woof => 1, meow => 1, ); return %valid; } sub _validate_params { my %p = @_; my %valid = valid_params(); for (keys %p){ die "$_ isn't a valid param" if ! exists $valid{$_}; } if (exists $p{moo} && ref $p{moo} ne 'HASH'){ die "param 'moo' must be a hash reference"; } }

Then in each sub that accepts params:

sub yell { my %p = @_; _validate_params(%p); ... }

Also keep in mind that the CPAN has some good modules for specific validations. Email::Valid for email addresses, Data::Validate::IP for v4/v6 IP addresses etc. Some of these validation distributions have been written by Perl heavy-hitters (the two I've mentioned certainly have), so if you've got a common type of param, do a quick search to see if someone has already written a validator for it. Sometimes the extra weight of an external module is worth it, sometimes it's not, but it's good to know these things are available if desired.

Replies are listed 'Best First'.
Re^2: Is validation of all subroutine arguments overkill?
by haukex (Archbishop) on May 21, 2016 at 14:14 UTC

    Hi stevieb,

    Testing for warnings is a good idea, and ++ for the rest of your post, I just wanted to point out that I got burned by code similar to what you posted: You can't know how many warnings a piece of code will produce, especially over various Perl versions. Example: this CPAN Testers failure is caused by my test code expecting an exact number of warnings, but Perl 5.6 produced an extra warning in this case. Here's the code I currently use (in addition to Test::Fatal):

    sub warns (&) { my $sub = shift; my @warns; { local $SIG{__WARN__} = sub { push @warns, shift }; $sub->() } return @warns; } # example usage use Test::More tests=>2; my @w = warns { is some_func(), "bar"; }; is grep({/\bfoo\b/} @w), 1; sub some_func { # could be any number of other warnings here warn "foo"; return "bar"; }

    Regards,
    -- Hauke D

      Yes, thanks for pointing that out, and in fact I usually do implement it with push() to an array. I'm in slow-mo this morning as it's a dreary Saturday to start a long weekend :)

      I'll keep my example as is so that future readers have context for your post, and can see the potential pitfall.

Re^2: Is validation of all subroutine arguments overkill?
by RonW (Parson) on May 23, 2016 at 21:43 UTC
    less edge cases to test against

    Maybe

    Testing the boundaries checked by the parameter validation is a form of edge case testing that should be done. For whatever reasonable value of epsilon, you need to validate that values just "inside" the boundaries are accepted and those just "outside" are rejected.