PerlMonks  

Exploring Type::Tiny Part 1: Using Type::Params for Validating Function Parameters

by tobyink (Canon)
on Jul 30, 2018 at 18:49 UTC

Type::Tiny is probably best known as a way of having Moose-like type constraints in Moo, but it can be used for so much more. This is the first in a series of posts showing other things you can use Type::Tiny for.

Let's imagine you have a function which takes three parameters: a colour, a string of text, and a filehandle. Something like this:

  sub htmlprint {
    my %arg = @_;
    $arg{file}->printf(
      '<span style="color:%s">%s</span>',
      $arg{colour},
      $arg{text},
    );
  }

Nice little function. Simple enough. But if people call it like this:

  htmlprint( file => $fh, text => "Hello world", color => "red" );

... then they'll get weird and unexpected behaviour. Have you spotted the mistake?

Yes, "colour" versus "color".
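To make the failure concrete, here is a minimal sketch using only core Perl. It calls the function with the misspelled key and captures the output in an in-memory buffer (the buffer and the `$buf` name are just for illustration). Nothing dies; at best perl may emit an "uninitialized value" warning.

```perl
use strict;
use warnings;
use IO::Handle;   # makes ->printf available on lexical filehandles on older perls

sub htmlprint {
    my %arg = @_;
    $arg{file}->printf(
        '<span style="color:%s">%s</span>',
        $arg{colour},
        $arg{text},
    );
}

# Print into an in-memory buffer so the result can be inspected.
my $buf = '';
open(my $fh, '>', \$buf) or die "Cannot open in-memory handle: $!";

# Note the typo: "color" instead of "colour".
htmlprint( file => $fh, text => "Hello world", color => "red" );
close $fh;

print "$buf\n";   # prints <span style="color:">Hello world</span>
```

The colour is silently dropped from the markup, which is exactly the kind of bug that is tedious to track down later.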

So it's often good to perform some kind of checking of incoming data in user-facing functions. (Private functions which aren't part of your external API might not require such rigorous checks.)

Let's see how you might do that in Perl:

  use Carp qw(croak);

  sub htmlprint {
    my %arg = @_;
    exists $arg{file}   or croak "Expected file";
    exists $arg{text}   or croak "Expected text";
    exists $arg{colour} or croak "Expected colour";
    $arg{file}->printf(
      '<span style="color:%s">%s</span>',
      $arg{colour},
      $arg{text},
    );
  }

But of course, this is only a bare minimum. We could go further and check that $arg{file} is a filehandle (or at least an object with a printf method), and that $arg{text} and $arg{colour} are strings.

  use Carp qw(croak);
  use Scalar::Util qw(blessed);

  sub htmlprint {
    my %arg = @_;
    exists $arg{file}   or croak "Expected file";
    exists $arg{text}   or croak "Expected text";
    exists $arg{colour} or croak "Expected colour";
    ref($arg{file}) eq 'GLOB'
      or blessed($arg{file}) && $arg{file}->can('printf')
      or croak "File should be a filehandle or object";
    defined($arg{text}) && !ref($arg{text})
      or croak "Text should be a string";
    defined($arg{colour}) && !ref($arg{colour})
      or croak "Colour should be a string";
    $arg{file}->printf(
      '<span style="color:%s">%s</span>',
      $arg{colour},
      $arg{text},
    );
  }

Suddenly our nice little function isn't looking so little any more. Type::Tiny and friends to the rescue!

Type::Tiny comes bundled with a module called Type::Params which is designed for just this sort of thing. Let's see how it can be used.

  use feature qw(state);
  use Type::Params qw(compile_named);
  use Types::Standard qw(FileHandle HasMethods Str);

  sub htmlprint {
    state $check = compile_named(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str,
    );
    my $arg = $check->(@_);
    $arg->{file}->printf(
      '<span style="color:%s">%s</span>',
      $arg->{colour},
      $arg->{text},
    );
  }

This looks a lot neater and the code is pretty self-documenting. And you can use the same type constraints you might already be using in your object attributes.

So what's going on here? $check is a super-optimized coderef for checking the function's parameters, built using the same code inlining techniques used by Moose and Moo constructors and accessors. While it runs very fast, it is kind of slow to build it, which is why we store it in a state variable. That way it only gets compiled once when the function is first called, and can then be reused for each subsequent call.
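The build-once behaviour of a state variable can be seen in a small core-Perl sketch. Here `build_checker` and `shout` are hypothetical stand-ins (not part of Type::Params): the first plays the role of an expensive `compile_named` call, and a counter records how often it runs.

```perl
use strict;
use warnings;
use feature qw(state);

my $builds = 0;

# Hypothetical stand-in for compile_named(): costly to build, cheap to run.
sub build_checker {
    $builds++;
    return sub {
        my %arg = @_;
        defined $arg{text} or die "Expected text\n";
        return \%arg;
    };
}

sub shout {
    state $check = build_checker();   # initialiser runs on the first call only
    my $arg = $check->(@_);
    return uc $arg->{text};
}

shout( text => "hello" ) for 1 .. 3;
print "checker built $builds time(s)\n";   # prints "checker built 1 time(s)"
```

Three calls, one build: the cost of constructing the checker is paid once and every later call only pays for running it.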

If you're stuck with Perl 5.8 and so can't use state, it's easy enough to do something similar with normal lexical variables:

  use Type::Params qw(compile_named);
  use Types::Standard qw(FileHandle HasMethods Str);

  my $_check_htmlprint;
  sub htmlprint {
    $_check_htmlprint ||= compile_named(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str,
    );
    my $arg = $_check_htmlprint->(@_);
    ...;  # rest of the function goes here
  }

As a bonus, it actually checks more things for you than our earlier approach. In particular, it will complain if you try to pass extra unknown parameters:

  # will throw an exception because of 'size'
  htmlprint( file => $fh, text => "Hello world", colour => "red", size => 7 );
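If you want to see that rejection without killing your program, you can wrap the call in eval. The following is only a sketch: it assumes Type::Tiny is installed, uses \*STDOUT as the filehandle, and makes no promise about the exact wording of the error message.

```perl
use strict;
use warnings;
use feature qw(state);
use IO::Handle;
use Type::Params qw(compile_named);
use Types::Standard qw(FileHandle HasMethods Str);

sub htmlprint {
    state $check = compile_named(
        file   => FileHandle | HasMethods['printf'],
        text   => Str,
        colour => Str,
    );
    my $arg = $check->(@_);   # dies here if the parameters are not acceptable
    $arg->{file}->printf(
        '<span style="color:%s">%s</span>',
        $arg->{colour},
        $arg->{text},
    );
}

# The unknown 'size' parameter makes $check throw before anything is printed.
my $ok = eval {
    htmlprint(
        file   => \*STDOUT,
        text   => "Hello world",
        colour => "red",
        size   => 7,
    );
    1;
};
print "Rejected: $@\n" unless $ok;
```

The exception in $@ stringifies to a readable message, so printing it is usually enough for debugging.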

And it will allow you to call the function passing a hashref of parameters:

  htmlprint({ file => $fh, text => "Hello world", colour => "red" });

Since Type::Tiny 1.004000 you can also supply defaults for missing parameters:

  use feature qw(state);
  use Type::Params 1.004000 qw(compile_named);
  use Types::Standard qw(FileHandle HasMethods Str);

  sub htmlprint {
    state $check = compile_named(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str, { default => "black" },
    );
    my $arg = $check->(@_);
    ...;  # rest of the function goes here
  }

Protecting Against Typos Inside the Function

Recent versions of Type::Params allow you to return an object instead of a hashref from $check. To do this, use compile_named_oo instead of compile_named:

  use feature qw(state);
  use Type::Params 1.004000 qw(compile_named_oo);
  use Types::Standard qw(FileHandle HasMethods Str);

  sub htmlprint {
    state $check = compile_named_oo(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str, { default => "black" },
    );
    my $arg = $check->(@_);
    $arg->file->printf(  # not $arg->{file}
      '<span style="color:%s">%s</span>',
      $arg->colour,      # not $arg->{colour}
      $arg->text,        # not $arg->{text}
    );
  }

This adds a slight performance hit to your code (though it shouldn't significantly affect the speed of $check itself), but it does look a little more elegant, and it will give you somewhat helpful error messages (about there being no such method as $arg->color) if you mistype a parameter name.

Shifting off $self

Now imagine our function is intended to be called as a method. We probably want to shift $self off @_ first. Just do this as normal:

  use feature qw(state);
  use Type::Params 1.004000 qw(compile_named);
  use Types::Standard qw(FileHandle HasMethods Str);

  sub htmlprint {
    state $check = compile_named(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str, { default => "black" },
    );
    my $self = shift;
    my $arg = $check->(@_);
    ...;  # rest of the function goes here
  }

It's sometimes useful to check $self is really a blessed object and not, say, the class name. (That is, check we've been called as an object method instead of a class method.)

  use feature qw(state);
  use Type::Params 1.004000 qw(compile_named);
  use Types::Standard qw(FileHandle HasMethods Str Object);

  sub htmlprint {
    state $check = compile_named(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str, { default => "black" },
    );
    my $self = Object->(shift);  # will die if it's not an object
    my $arg = $check->(@_);
    ...;  # rest of the function goes here
  }
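As a rough illustration of that Object check on its own, the sketch below uses a hypothetical class, My::Printer, with a hypothetical describe method. It assumes Type::Tiny is installed and Perl 5.14 or later (for the package block syntax).

```perl
use strict;
use warnings;

package My::Printer {
    use Types::Standard qw(Object);

    sub new {
        my $class = shift;
        return bless {}, $class;
    }

    sub describe {
        my $self = Object->(shift);   # dies unless $self is a blessed object
        return "called on an instance";
    }
}

# Object-method call: the check passes.
my $msg = My::Printer->new->describe;
print "$msg\n";   # prints "called on an instance"

# Class-method call: $self is the string "My::Printer", so the check dies.
my $ok = eval { My::Printer->describe; 1 };
print "Class-method call rejected\n" unless $ok;
```

The same pattern works with any other type constraint, so you could equally insist on a particular class or role for $self.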

Positional Parameters

For functions with three or more parameters, it usually makes sense to use named parameters (as above), but if you want to use positional parameters, use compile instead of compile_named:

  use feature qw(state);
  use Type::Params 1.004000 qw(compile);
  use Types::Standard qw(FileHandle HasMethods Str);

  sub htmlprint {
    state $check = compile(
      FileHandle | HasMethods['printf'],
      Str,
      Str, { default => "black" },
    );
    my ($file, $text, $colour) = $check->(@_);
    ...;  # rest of the function goes here
  }

  htmlprint($fh, "Hello world", "red");
  htmlprint($fh, "Hello world");  # defaults to black

Coercions

One of the most powerful features of Moose type constraints is type coercion. This allows you to automatically convert between types when a type check would otherwise fail. Let's define a coercion from a string filename to a filehandle:

  package My::Types {
    use Type::Library -base;
    use Type::Utils -all;
    use Types::Standard ();

    declare "FileHandle",
      as Types::Standard::FileHandle;

    coerce "FileHandle",
      from Types::Standard::Str, via {
        # open for writing, because htmlprint prints to the handle
        open(my $fh, ">", $_) or die("Could not open $_: $!");
        return $fh;
      };
  }

Now we can use our custom FileHandle type:

  use feature qw(state);
  use Type::Params 1.004000 qw(compile_named);
  use Types::Standard qw(HasMethods Str);
  use My::Types qw(FileHandle);

  sub htmlprint {
    state $check = compile_named(
      file   => FileHandle | HasMethods['printf'],
      text   => Str,
      colour => Str, { default => "black" },
    );
    my $arg = $check->(@_);
    ...;  # rest of the function goes here
  }

Now this will work:

  htmlprint(
    file => "/tmp/out.html",  # will be coerced to a filehandle
    text => "Hello world",
  );

You don't need to say coerce => 1 anywhere. Coercions happen by default. If you wish to disable coercions, you can use Type::Tiny's handy no_coercions method:

  use feature qw(state);
  use Type::Params 1.004000 qw(compile_named);
  use Types::Standard qw(HasMethods Str);
  use My::Types qw(FileHandle);

  sub htmlprint {
    state $check = compile_named(
      file   => FileHandle->no_coercions | HasMethods['printf'],
      text   => Str,
      colour => Str, { default => "black" },
    );
    my $arg = $check->(@_);
    ...;  # rest of the function goes here
  }

The no_coercions method disables coercions for just that usage of the type constraint. (It does so by transparently creating a child type constraint without any coercions.)

Performance

All this does come at a performance cost, particularly for the first time a sub is called and $check needs to be compiled. But for a frequently called sub, Type::Params will perform favourably compared to most other solutions.

According to my own benchmarking (though if you want to be sure, do your own benchmarking which will better cover your own use cases), Type::Params performs a smidgen faster than Params::ValidationCompiler, about five times faster than Params::Validate, about ten times faster than Data::Validator, and about twenty times faster than MooseX::Params::Validate.

Short of writing your own checking code inline (and remember how long and ugly that started to look!), you're unlikely to find a faster way to check parameters for a frequently used sub.

Many of Type::Tiny's built-in type checks can be accelerated by installing Type::Tiny::XS and/or Ref::Util::XS.

One very minor performance improvement... this:

  my $arg = $check->(@_);

... will run very slightly faster if you write it like this:

  my $arg = &{$check};

It's a fairly Perl-4-ish way of calling subs, but it's more efficient as Perl avoids creating a new @_ array for the called function and simply passes it the caller's @_ as-is.
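One consequence of sharing @_ is worth knowing before you adopt this form everywhere: if the called code modifies @_, the caller sees the change. Here is a core-Perl sketch of the difference, where $take_first is a hypothetical coderef standing in for $check:

```perl
use strict;
use warnings;

# Hypothetical callee that consumes its first argument.
my $take_first = sub { return shift @_ };

sub call_with_parens {
    $take_first->(@_);    # callee gets its own @_ array
    return scalar @_;     # caller's @_ is unchanged
}

sub call_with_ampersand {
    &{$take_first};       # callee works directly on the caller's @_
    return scalar @_;     # one element has been shifted off
}

print call_with_parens(1, 2, 3), "\n";      # prints 3
print call_with_ampersand(1, 2, 3), "\n";   # prints 2
```

So if you read @_ again after the call, it is safest to treat it as possibly changed, or to stick with the $check->(@_) form.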
