PerlMonks  

Re: When every microsecond counts: Parsing subroutine parameters

by samtregar (Abbot)
on May 17, 2008 at 20:32 UTC


in reply to When every microsecond counts: Parsing subroutine parameters

Take home: every single method here is really, really fast. 24k calls per second is not slow. You'd have to make some fairly radical mistakes in your API before that would ever matter!

If parameter parsing is really a bottleneck in your application then you're probably making too many subroutine calls! Much better than trying to tune parameter setup is to simply write methods that do work over sets of data. So instead of something like:

foreach my $data (@rows) { $processor->do_one(data => $data); }

Add support for:

$processor->do_all(rows => \@rows);
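
A minimal sketch of the difference, assuming a hypothetical Processor class (the class name and the doubling "work" are illustrative only, not taken from the original post). The batch method pays the named-parameter cost once per call rather than once per row:

```perl
use strict;
use warnings;

package Processor;    # hypothetical class for illustration

sub new { return bless {}, shift }

# Per-item API: one method call and one hash build per row.
sub do_one {
    my ($self, %args) = @_;
    return $args{data} * 2;    # placeholder work
}

# Batch API: a single call; the loop runs inside the method.
sub do_all {
    my ($self, %args) = @_;
    return [ map { $_ * 2 } @{ $args{rows} } ];
}

package main;

my $processor = Processor->new;
my @rows      = (1 .. 5);

my @one_by_one = map { $processor->do_one(data => $_) } @rows;
my $batched    = $processor->do_all(rows => \@rows);

print "@one_by_one\n";    # prints "2 4 6 8 10"
print "@$batched\n";      # prints "2 4 6 8 10"
```

Both calls produce the same results; only the number of calls and argument hashes differs.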

But the vast majority of applications out there are not CPU bound to begin with - they're I/O bound, waiting on a network, the filesystem or a database. In any of those cases you'd be pretty unlikely to see parameter parsing show up on a profile - I use Params::Validate all the time and I've never seen it show up.

-sam

Replies are listed 'Best First'.
Re^2: When every microsecond counts: Parsing subroutine parameters
by jplindstrom (Monsignor) on May 17, 2008 at 21:58 UTC
    Well, I recall reading something about Plucene, the Perl port of Lucene, being very difficult to get performant. After optimization it was uniformly slow because of many method calls. This wasn't really a problem in Java but was a problem in Perl.

    (This is what I recall; a quick Google session didn't turn up the mail or post I remember reading about this. Plucene developers would obviously know the real story here.)

    But it is interesting to ask what we should do about this issue. Named parameters are a very common idiom. It is a very good idiom, in that it leads to maintainable code.

    So, what can we do to make it perform better? Some special optimization of this case in the perl implementation? Some new syntax to support this idiom? Could it be related to the new named arguments being proposed for perl 5.12?

    /J

      I don't think any optimization would help much, so I doubt one will be implemented. I ran a few benchmarks to see how much of the additional overhead is related to the repeated creation of the hash and thus might be removed by reusing the hash:

      use Benchmark qw(cmpthese);

      # with_hash takes a hash reference; wo_hash builds a hash from the flat argument list.
      sub with_hash { my ($one, $two) = @{$_[0]}{'one', 'two'}; }
      sub wo_hash   { my ($one, $two) = @{{@_}}{'one', 'two'}; }

      # Hash reused across calls in the with_consthash* variants.
      my %h = (one => undef, two => undef);

      cmpthese(1000000, {
          wo_hash               => sub { wo_hash(one => 7, two => 9) },
          with_hash             => sub { with_hash({one => 7, two => 9}) },
          with_consthash        => sub { with_hash(\%h) },
          with_consthash_mod    => sub { @h{'one','two'} = (8,1); with_hash(\%h) },
          with_consthash_modd   => sub { @h{'one','two'} = (8,1); with_hash(\%h); @h{'one','two'}=() },
          with_consthash_moddL  => sub { local @h{'one','two'} = (8,1); with_hash(\%h); },
          with_consthash_moddRA => sub { @h{'one','two'} = (8,1); my @r=with_hash(\%h); @h{'one','two'}=(); @r },
          with_consthash_moddRS => sub { @h{'one','two'} = (8,1); my $r=with_hash(\%h); @h{'one','two'}=(); $r },
      });

      As you can see, I first tried passing a completely constant hash. That looked OK, much better than the inline anonymous hash in with_hash({one => 7, two => 9}) (1002004/s vs 424628/s on my computer with Perl 5.8.8). The problem is that once I modified the values in the hash before the call, the gain got much smaller (634921/s vs 424628/s). A further problem is that the values are kept in the hash between invocations, which doesn't matter for numbers but would matter for huge strings or for references. So I had to clear the values. undef()ing the whole hash destroyed any gain whatsoever, and setting the values to undef left me at just 489237/s vs 424628/s. And that was if the called subroutine did not need to return anything!

      I tried to use local() on the hash slice or assign the return value into a variable, but that just made things worse. When the function was supposed to return a list, it was even slower than the normal inline hash version.

      So even if perl created a hash for the subroutine just once, kept it and just modified and removed the values for each call, the speed gain would be fairly small, and it would come at a fairly high price in both memory footprint and code complexity.

      The only thing that might really help would be to convert the named parameter calls into positional ones at compile time. The catch is that Perl would have to know, at the time it compiles the call, all the named parameters the subroutine/method can take and the order in which they are expected once converted to positional. That is completely out of the question for methods.

      I'm afraid we have to live with the overhead and in the rare case it actually matters, change the subroutine/method to use positional params.
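
One way to sketch that fallback, assuming hypothetical function names (scale and scale_pos are made up for illustration): keep a positional core for hot loops and a thin named wrapper for everyone else.

```perl
use strict;
use warnings;

# Positional core: no hash is built per call.
sub scale_pos {
    my ($value, $factor) = @_;
    return $value * $factor;
}

# Named wrapper for callers that prefer readability over speed.
sub scale {
    my %args = @_;
    return scale_pos(@args{qw(value factor)});
}

my $named      = scale(value => 7, factor => 3);
my $positional = scale_pos(7, 3);

print "$named $positional\n";    # prints "21 21"
```

Only the code that has actually shown up in a profile needs to call the positional form directly.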

        I'm afraid we have to live with the overhead...

        There is another alternative. Don't use named parameters.

        Why does anyone use named parameters?

        Let's see. How many of these languages do you think use named parameters at the call site?:

        ABC ACSL Ada Alef Algol Algol68 APL AppleScript AutoIt Autolisp Awk BASIC BCPL Befunge BETA BLISS BLooP C C# C* C++ Cecil CFML CHILL Cilk CLAIRE Clean CLU CMS-2 COBOL Common Lisp Concurrent Clean Concurrent Pascal CORAL 66 CorelScript csh CSP cT Curry Dylan Dynace Eiffel Elisp Erlang Escher Esterel Euphoria FLooP FORMAC Forms/3 Forth FORTRAN FP Goedel GPSS Haskell Hope HyperTalk ICI Icon INTERCAL Interlisp J Java JavaScript Jovial Leda LIFE Limbo Lingo Lisp Logo LotusScript Lua Lucid M Magma Mathematica Mawl Mercury Miranda ML Modula 3 Modula-2 MUMPS NESL NIAL Oberon Objective-C Obliq occam OPS5 Orca Oz Pascal PerfectScript Perl PHP Pict Pike Pilot PL/C PL/I Postscript Prolog Python QBasic Quake-C REBOL Reduce Rexx RPG Ruby S Sather Scheme Self SETL sh Simscript SIMULA

        50%? 10%, 5%, 1%, 2?

        Even the much maligned Visual Basic programmers seem to be able to write and maintain their code without this crutch. Why do Perl programmers suddenly feel the need for it?

        I think that some time ago, someone found that they could do it. That a combination of Perl's syntax and hashes meant that it was possible. And kinda cute. And for complex constructors with lots of possible parameters, many optional, it makes a certain amount of sense. You mostly don't call heavy constructors in tight loops so there's no great harm in using it. For constructors.
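
A minimal sketch of the constructor case, assuming a hypothetical Connection class with made-up defaults: the named arguments are merged over a hash of defaults, so callers mention only what they want to change.

```perl
use strict;
use warnings;

package Connection;    # hypothetical class for illustration

sub new {
    my ($class, %args) = @_;
    my $self = {
        host    => 'localhost',    # assumed defaults
        port    => 80,
        timeout => 30,
        %args,                     # caller-supplied values override defaults
    };
    return bless $self, $class;
}

package main;

my $conn = Connection->new(port => 8080);

print "$conn->{host}:$conn->{port} timeout=$conn->{timeout}\n";
# prints "localhost:8080 timeout=30"
```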

        But for most general purpose subroutines and method calls, the need for named parameters--i.e. calls that take so many arguments that naming them is beneficial beyond an aide-mémoire for the casual tourist to the code--is strongly indicative of something seriously wrong in the design of the API.

        Mostly, it is just as hard to look up the naming and spelling and casing conventions of named parameters when writing the calls, and just as hard to interpret the meaning of those names when reading them.

        For most programmers in most languages, naming the positional arguments (formal parameters) within the sub or method is perfectly clear and effective. And Perl has that ability. And any edicts to force named parameters upon Perl programmers are based on YAJ. (Yet another Justifiction.)
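
A short sketch of that style, using a hypothetical pad_left helper (not from the thread): the arguments are positional at the call site and get their names once, in the list assignment at the top of the sub.

```perl
use strict;
use warnings;

# The formal parameters are named here, inside the sub.
sub pad_left {
    my ($text, $width, $fill) = @_;
    $fill = ' ' unless defined $fill;    # optional third argument
    my $missing = $width - length $text;
    return $missing > 0 ? ($fill x $missing) . $text : $text;
}

print pad_left('42', 5, '0'), "\n";    # prints "00042"
```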


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      So, what can we do to make it perform better?

      If by "it" you mean "our programs" I think the answer is simple - make fewer subroutine calls. If your program is making so many tiny do-nothing calls that parameter parsing or even just subroutine overhead is a significant factor then you're just making too many calls.

      It's a fact of life in Perl that method calls cost something (not too much, but not nothing either). That just means you need to make them count!

      -sam

      Here's a write-up of the Plucene method call issue.

      Everything in Lucene is a method, down to outstream.writeByte(). Hash-based method dispatch just isn't fast enough for a straight port.
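
To illustrate the granularity problem, here is a sketch assuming a hypothetical OutStream class (a stand-in, not Plucene's actual code). The per-byte method mirrors the Lucene style; the coarser method does the same work with a single dispatch:

```perl
use strict;
use warnings;

package OutStream;    # hypothetical stand-in, not Plucene's real class

sub new { return bless { buf => '' }, shift }

# Lucene-style: one method dispatch per byte.
sub write_byte {
    my ($self, $byte) = @_;
    $self->{buf} .= chr $byte;
    return;
}

# Coarser-grained: one method dispatch for the whole list.
sub write_bytes {
    my ($self, @bytes) = @_;
    $self->{buf} .= pack 'C*', @bytes;
    return;
}

sub contents { return $_[0]{buf} }

package main;

my @bytes = (72, 101, 108, 108, 111);    # the bytes of "Hello"

my $per_byte = OutStream->new;
$per_byte->write_byte($_) for @bytes;    # five method calls

my $batched = OutStream->new;
$batched->write_bytes(@bytes);           # one method call

print $per_byte->contents, " ", $batched->contents, "\n";    # prints "Hello Hello"
```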

      --
      Marvin Humphrey
      Rectangular Research ― http://www.rectangular.com
      ... because of many method calls. This wasn't really a problem in Java but was a problem in Perl.

      Java performs inline expansion of small methods at run time (much as C does with the inline keyword, or as gcc does automatically at the -O3 optimization level), so the per-call overhead largely disappears there.
