Here is something for people who are trying to optimize their code for performance without necessarily going all the way to inlining all of their performance-critical sections.

Parameter hashes are really expensive.

If you wander around the CPAN, you will notice that almost everyone loves parameter hashes. There are good reasons for that. It is definitely easier to read, maintain and extend something like

Search::InvertedIndex::Update->new({
                     -group => $group,
                     -index => $index,
                      -data => $index_data,
                      -keys => {
                           $key0 => 10,
                           $key1 => 20,
                           $key2 => 15,
                       },
});

than, say,

Search::InvertedIndex::Update->new($group, $index, $index_data, {
                           $key0 => 10,
                           $key1 => 20,
                           $key2 => 15,
                       }
);

Especially if you have optional parameters or many parameters. But what are you trading for that clarity and those flexible interfaces?

Runtime speed. And more than you might think.

I recently released a parameter hash parsing module, Acme::Sub::Parms, after two years of considering whether I should release it at all (it finally went into the Acme namespace because of its use of source filtering). While putting the final touches on it, I had reason to benchmark the different ways of parsing parameter hashes. The performance losses can be stunning.

Consider a simple function taking two parameters that are assigned to two lexical variables without validation. Here are some of the ways that it could be implemented:

You could use Params::Validate (with validation turned off to speed it up)

use Params::Validate  qw (validate);
$Params::Validate::NO_VALIDATION = 1;
sub params_validate {
    my ($handle, $thing) = @{(validate(@_, { handle => 1, thing => 1 }))}{'handle','thing'};
}

or you could use Class::ParmList

use Class::ParmList qw (simple_parms);
sub simple_parms_args {
    my ($handle, $thing) =  simple_parms(['handle','thing'], @_);
}

or my new Acme ('Acme, when you probably shouldn't have, but couldn't help yourself') module, Acme::Sub::Parms (again with validation turned off)

use Acme::Sub::Parms qw(:no_validation);

sub sub_parms_bindparms {
    BindParms : (
        my $handle : handle;
        my $thing  : thing;
    )
}

Or you could use hand rolled code:

sub one_step_args {
    my ($handle, $thing) =  @{{@_}}{'handle','thing'};
}

A little less obscurely you could use:

sub std_args {
    my %args = @_;
    my ($handle, $thing) =  @args{'handle','thing'};
}
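
The hand-rolled form also extends naturally to optional parameters. A minimal sketch, assuming a hypothetical 'timeout' option with a default of 30 (neither the option nor the sub name is part of the benchmarks in this post):

```perl
use strict;
use warnings;

# Hypothetical example: 'timeout' and its default are illustrative only.
sub std_args_with_default {
    # Defaults come first, so caller-supplied pairs override them.
    my %args = (timeout => 30, @_);
    my ($handle, $thing, $timeout) = @args{'handle', 'thing', 'timeout'};
    return ($handle, $thing, $timeout);
}

my @got = std_args_with_default(handle => 'Test', thing => 'something');
print "@got\n";    # prints "Test something 30"
```

Each default adds a little more work to building the hash on every call, so this convenience is paid for per invocation.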

If you wanted to be fancy, the hand rolled code could even be case-insensitive:

sub caseflat_std_args {
    my %args;
    {
        my %raw_args = @_;
        %args = map { lc($_) => $raw_args{$_} } keys %raw_args;
    }

    my ($handle, $thing) =  @args{'handle','thing'};
}

Finally, you could abandon the parameter hash for simple positional parameters:

sub positional_args {
    my ($handle, $thing) =  @_;
}

Rolling these up into a Benchmark script you get

#!/usr/bin/perl

use strict;
use warnings;

use Acme::Sub::Parms qw(:no_validation);
use Class::ParmList qw (simple_parms);
use Params::Validate qw (validate);
use Benchmark qw(cmpthese);
$Params::Validate::NO_VALIDATION = 1;
cmpthese(1000000, {
   'bindparms'     => sub { sub_parms_bindparms( handle => 'Test', 'thing' => 'something')},
   'std_args'      => sub { std_args( handle => 'Test', 'thing' => 'something')},
   'caseflat'      => sub { caseflat_std_args( handle => 'Test', 'thing' => 'something')},
   'one_step'      => sub { one_step_args( handle => 'Test', 'thing' => 'something')},
   'postnl_args'   => sub { positional_args( 'Test', 'something')},
   'simple_parms'  => sub { simple_parms_args( handle => 'Test', 'thing' => 'something')},
   'validate'      => sub { params_validate( handle => 'Test', 'thing' => 'something')},
        }
);
exit;

############################################################################

sub params_validate {
    my ($handle, $thing) = @{(validate(@_, { handle => 1, thing => 1 }))}{'handle','thing'};
}

sub sub_parms_bindparms {
    BindParms : (
        my $handle : handle;
        my $thing  : thing;
    )
}

sub simple_parms_args {
    my ($handle, $thing) =  simple_parms(['handle','thing'], @_);
}

sub positional_args {
    my ($handle, $thing) =  @_;
}

sub one_step_args {
    my ($handle, $thing) =  @{{@_}}{'handle','thing'};
}

sub caseflat_std_args {
    my %args;
    {
        my %raw_args = @_;
        %args = map { lc($_) => $raw_args{$_} } keys %raw_args;
    }

    my ($handle, $thing) =  @args{'handle','thing'};
}

sub std_args { 
    my %args = @_;
    my ($handle, $thing) =  @args{'handle','thing'};
}

And the resulting numbers?

 
                 Rate validate simple_parms caseflat bindparms one_step std_args postnl_args
validate      24851/s       --         -40%     -74%      -90%     -90%     -92%        -97%
simple_parms  41203/s      66%           --     -57%      -83%     -84%     -86%        -96%
caseflat      95969/s     286%         133%       --      -62%     -62%     -68%        -90%
bindparms    249377/s     903%         505%     160%        --      -1%     -16%        -73%
one_step     251889/s     914%         511%     162%        1%       --     -15%        -73%
std_args     296736/s    1094%         620%     209%       19%      18%       --        -68%
postnl_args  925926/s    3626%        2147%     865%      271%     268%     212%          --

Ouch. 'validate' is probably the most popular parameter parser, and it is molasses-in-winter slow. 'simple_parms' is faster, but only in relative terms. 'BindParms' is sort of OK, as are the hand-coded parsers for parameter hashes (with the exception of the case-flattening one), but the winner by several horse lengths is positional parameters. It is about 37 times as fast as 'validate' and a little over 3 times as fast as the fastest of the hand-coded parameter hash parsers.
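
To put the rates in absolute terms, they can be converted to time per call. A small sketch using the rates from the table above; the helper name `usec_per_call` is my own, and the figures are whole-call costs as measured by Benchmark, not the parsing cost in isolation:

```perl
use strict;
use warnings;

# Calls per second, copied from the benchmark table above.
my %rate = (
    validate     => 24851,
    simple_parms => 41203,
    caseflat     => 95969,
    bindparms    => 249377,
    one_step     => 251889,
    std_args     => 296736,
    postnl_args  => 925926,
);

# Microseconds per call is the reciprocal of the rate, scaled by 1e6.
sub usec_per_call {
    my ($calls_per_sec) = @_;
    return 1_000_000 / $calls_per_sec;
}

# Slowest first, matching the order of the table.
for my $name (sort { $rate{$a} <=> $rate{$b} } keys %rate) {
    printf "%-13s %6.2f us/call\n", $name, usec_per_call($rate{$name});
}
```

By this arithmetic 'validate' costs about 40 microseconds per call, while positional parameters cost about 1 microsecond.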

Lesson: When performance is on the line for a subroutine, use positional parameters, NOT parameter hashes.
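
One way to follow this advice without giving up a readable public interface is to keep a named-parameter wrapper for casual callers and a positional worker for hot loops. A minimal sketch; the sub names `add_entry` and `_add_entry_fast` are hypothetical and are not taken from Search::InvertedIndex:

```perl
use strict;
use warnings;

# Hypothetical positional worker: call this directly from hot loops.
sub _add_entry_fast {
    my ($handle, $thing) = @_;
    return "$handle:$thing";
}

# Hypothetical named-parameter wrapper: parses the parameter hash,
# then delegates to the positional worker.
sub add_entry {
    my %args = @_;
    return _add_entry_fast(@args{'handle', 'thing'});
}

print add_entry(handle => 'Test', thing => 'something'), "\n";
print _add_entry_fast('Test', 'something'), "\n";
```

The wrapper is slower than calling the worker directly, since it adds a second sub call on top of the hash parsing, so this only helps if the performance-critical callers use the positional form.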

I had this lesson driven home while writing Search::InvertedIndex when I ran the code through DProf. Changing just two performance critical subroutines from parameter hashes parsed using 'simple_parms' to positional parameters roughly tripled the performance of the entire module while indexing.

And that is the other lesson: Use code profilers to identify your performance bottlenecks. You are likely to be surprised by where you are losing most of your cycles.

Update: Since the question of the cost of the sub call itself was raised, I'm appending a benchmark for testing the sub call's impact.

#!/usr/bin/perl

use strict;
use warnings;

my @parms = ('handle','thing');
@_     = ('handle','thing');
use Benchmark qw(cmpthese);
cmpthese(2000000, {
   'one_sub'    => \&one_sub,
   'anon_sub'   => sub { my ($handle, $thing) = @_; },
   'double_sub' => sub { one_sub(); },
   'parm2_sub'  => sub { double_sub('handle','thing'); },
   'std_args'   => sub { std_args('handle','thing'); },
   # Note: this calls std_args, not std_args_d, so it measures the same code as the entry above.
   'std_args_d' => sub { std_args('handle','thing'); },
        }
);
exit;

############################################################################

sub one_sub { 
        my ($handle, $thing) = @_;
}
sub d_sub { "1"; }
sub double_sub {
        d_sub(@_);
        my ($handle, $thing) = @_;
}
sub std_args_d {
    my %args = @_;
    d_sub(@_);
    my ($handle, $thing) =  @args{'handle','thing'};
}

sub std_args {
    my %args = @_;
    my ($handle, $thing) =  @args{'handle','thing'};
}

And the results:

 
                Rate std_args_d std_args parm2_sub double_sub anon_sub   one_sub
std_args_d  392157/s         --      -0%      -34%       -65%     -84%      -85%
std_args    393701/s         0%       --      -33%       -64%     -84%      -85%
parm2_sub   589971/s        50%      50%        --       -47%     -76%      -78%
double_sub 1104972/s       182%     181%       87%         --     -55%      -59%
anon_sub   2469136/s       530%     527%      319%       123%       --       -9%
one_sub    2702703/s       589%     586%      358%       145%       9%        --

As you can see, the sub call itself doesn't matter that much. The overhead of decoding parameter hashes is much larger than the overhead of the sub call, and the overhead of the sub call is only somewhat larger (perhaps 1.5 to 1.7 times) than the overhead of decoding the positional parameters alone.

Update2: Put some 'readmore' sections around the benchmarking code chunks.

In reply to When every microsecond counts: Parsing subroutine parameters by snowhare
