PerlMonks  

Re^3: When every microsecond counts: Parsing subroutine parameters

by Jenda (Abbot)
on May 17, 2008 at 23:08 UTC ( [id://687154]=note )


in reply to Re^2: When every microsecond counts: Parsing subroutine parameters
in thread When every microsecond counts: Parsing subroutine parameters

I don't think any optimization would help much, and thus none will be implemented. I ran a few benchmarks to see how much of the additional overhead is related to the repeated creation of the hash and thus might be removed by reusing the hash:

    use Benchmark qw(cmpthese);

    sub with_hash {
        my ($one, $two) = @{$_[0]}{'one', 'two'};
    }
    sub wo_hash {
        my ($one, $two) = @{{@_}}{'one', 'two'};
    }

    my %h = (one => undef, two => undef);

    cmpthese(1000000, {
        wo_hash               => sub { wo_hash(one => 7, two => 9) },
        with_hash             => sub { with_hash({one => 7, two => 9}) },
        with_consthash        => sub { with_hash(\%h) },
        with_consthash_mod    => sub { @h{'one','two'} = (8,1); with_hash(\%h) },
        with_consthash_modd   => sub { @h{'one','two'} = (8,1); with_hash(\%h); @h{'one','two'} = () },
        with_consthash_moddL  => sub { local @h{'one','two'} = (8,1); with_hash(\%h); },
        with_consthash_moddRA => sub { @h{'one','two'} = (8,1); my @r = with_hash(\%h); @h{'one','two'} = (); @r },
        with_consthash_moddRS => sub { @h{'one','two'} = (8,1); my $r = with_hash(\%h); @h{'one','two'} = (); $r },
    });

As you can see, I tried passing a completely constant hash. That looked OK, much better than foo({one => 1, two => 2}) (1002004/s vs 424628/s on my computer with Perl 5.8.8). The problem is that once I modified the values in the hash before the call, the gain got much smaller (634921/s vs 424628/s). Worse, the values were kept in the hash between invocations, which doesn't matter for numbers but would matter for huge strings or for references. So I had to clear the values. undef()ing the whole hash destroyed any gain whatsoever, and setting the values to undef took me to just 489237/s vs 424628/s. And that was if the called subroutine did not need to return anything!
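To make the "values were kept between invocations" point concrete, here is a minimal sketch. The Tracker class and the two call_* wrappers are made up for illustration and are not part of the benchmark; the sketch only shows that a reference left in a reused hash keeps its object alive until the slot is overwritten or cleared.

```perl
use strict;
use warnings;

# Hypothetical class whose destructor counts how many objects were freed.
package Tracker;
my $destroyed = 0;
sub new       { return bless {}, shift }
sub DESTROY   { $destroyed++ }
sub destroyed { return $destroyed }

package main;

# One hash reused for every call, as in the with_consthash_* cases.
my %shared = (one => undef, two => undef);

sub with_hash {
    my ($one, $two) = @{ $_[0] }{ 'one', 'two' };
    return;
}

sub call_without_clearing {
    @shared{ 'one', 'two' } = @_;
    with_hash(\%shared);
}

sub call_with_clearing {
    @shared{ 'one', 'two' } = @_;
    with_hash(\%shared);
    @shared{ 'one', 'two' } = ();    # drop the values after the call
}

call_without_clearing(Tracker->new, 1);
my $after_plain = Tracker::destroyed();    # object is still held by %shared

call_with_clearing(Tracker->new, 2);
my $after_clear = Tracker::destroyed();    # nothing is left in %shared now
```

The clearing step is exactly the extra work that ate most of the gain in the with_consthash_modd case.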

I tried to use local() on the hash slice or assign the return value into a variable, but that just made things worse, in case the function was supposed to return a list, even worse than the normal inline hash version.

So even if perl created a hash for the subroutine just once, kept it and just modified and removed the values for each call, the speed gain would be fairly small. For a fairly high price both in memory footprint and code complexity.

The only thing that might really help would be to convert the named parameter calls into positional ones at compile time. The catch is that Perl would have to know, at the time it compiles the call, all named parameters the subroutine/method can take and the order in which they are expected once converted to positional. Which is completely out of the question for methods.
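A rough sketch of what such a conversion needs, done at run time purely for illustration. The %POSITION_OF table and both subs are hypothetical; a real compile-time rewrite would need this table for every sub before any call to it is compiled, which is the catch described above.

```perl
use strict;
use warnings;

# Hypothetical signature table: parameter name => position in @_.
my %POSITION_OF = (one => 0, two => 1);

# Positional version of the subroutine: no hash is built on entry.
sub positional {
    my ($one, $two) = @_;
    return defined $two ? $one + $two : $one;
}

# The mapping a compiler would have to apply to each call site.
sub to_positional {
    my @args;
    while (my ($name, $value) = splice @_, 0, 2) {
        die "unknown parameter '$name'" unless exists $POSITION_OF{$name};
        $args[ $POSITION_OF{$name} ] = $value;
    }
    return @args;
}

my $sum = positional( to_positional(two => 9, one => 7) );
```

For a method call the class of the invocant is not known until run time, so there is no single table to consult.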

I'm afraid we have to live with the overhead and in the rare case it actually matters, change the subroutine/method to use positional params.

Replies are listed 'Best First'.
Re^4: When every microsecond counts: Parsing subroutine parameters
by BrowserUk (Patriarch) on May 18, 2008 at 00:39 UTC
    I'm afraid we have to live with the overhead...

    There is another alternative. Don't use named parameters.

    Why does anyone use named parameters?

    Let's see. How many of these languages do you think use named parameters at the call site?:

    ABC ACSL Ada Alef Algol Algol68 APL AppleScript AutoIt Autolisp Awk BASIC BCPL Befunge BETA BLISS BLooP C C# C* C++ Cecil CFML CHILL Cilk CLAIRE Clean CLU CMS-2 COBOL Common Lisp Concurrent Clean Concurrent Pascal CORAL 66 CorelScript csh CSP cT Curry Dylan Dynace Eiffel Elisp Erlang Escher Esterel Euphoria FLooP FORMAC Forms/3 Forth FORTRAN FP Goedel GPSS Haskell Hope HyperTalk ICI Icon INTERCAL Interlisp J Java JavaScript Jovial Leda LIFE Limbo Lingo Lisp Logo LotusScript Lua Lucid M Magma Mathematica Mawl Mercury Miranda ML Modula 3 Modula-2 MUMPS NESL NIAL Oberon Objective-C Obliq occam OPS5 Orca Oz Pascal PerfectScript Perl PHP Pict Pike Pilot PL/C PL/I Postscript Prolog Python QBasic Quake-C REBOL Reduce Rexx RPG Ruby S Sather Scheme Self SETL sh Simscript SIMULA

    50%? 10%, 5%, 1%, 2?

    Even the much maligned VB Basic programmers seem to be able to write and maintain their code without this crutch. Why do Perl programmers suddenly feel the need for it?

    I think that some time ago, someone found that they could do it. That a combination of Perl's syntax and hashes meant that it was possible. And kinda cute. And for complex constructors with lots of possible parameters, many optional, it makes a certain amount of sense. You mostly don't call heavy constructors in tight loops so there's no great harm in using it. For constructors.

    But for most general purpose subroutines and method calls, the need for named parameters--i.e. calls that take so many arguments that naming them is beneficial beyond an aide-mémoire for the casual tourist to the code--is strongly indicative of something seriously wrong in the design of the API.

    Mostly, it is just as hard to look up the naming and spelling and casing conventions of named parameters when writing the calls, and just as hard to interpret the meaning of those names when reading them.

    For most programmers in most languages, naming the positional arguments (formal parameters) within the sub or method is perfectly clear and effective. And Perl has that ability. And any edicts to force this upon Perl programmers are based on YAJ. (Yet another Justifiction.)


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Let's see. How many of these languages do you think use named parameters at the call site?:
      I can't really make sense of that sentence, but anyway programming is not a democracy, and it's still evolving fast enough that just counting languages won't give you meaningful insights.

      I think that some time ago, someone found that they could do it. That a combination of Perl's syntax and hashes meant that it was possible. And kinda cute. And for complex constructors with lots of possible parameters, many optional, it makes a certain amount of sense. You mostly don't call heavy constructors in tight loops so there's no great harm in using it. For constructors.
      AFAICT the big advantage of named parameters is that you can leave out the parts that default. This is great when you've got loads of options. And yes, most function calls do not need a lot of options. But named options really do make a lot of sense whenever you've got two or more of them.
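A minimal sketch of the "leave out the parts that default" idiom. format_line and its option names are invented for the example; the only real technique here is listing the defaults first and @_ last, so that caller-supplied pairs win.

```perl
use strict;
use warnings;

# Hypothetical formatter; every option except 'text' has a default.
sub format_line {
    my %opt = (
        width  => 20,     # defaults come first ...
        indent => 0,
        fill   => ' ',
        @_,               # ... so caller-supplied pairs override them
    );
    my $body = defined $opt{text} ? $opt{text} : '';
    my $line = ($opt{fill} x $opt{indent}) . $body;
    return substr($line, 0, $opt{width});
}

my $plain    = format_line(text => 'hello');
my $indented = format_line(text => 'hello', indent => 2, fill => '.');
```

The price is the hash built on every call, which is the overhead this thread is about.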

      Mostly, it is just as hard to look up the naming and spelling and casing conventions of named parameters when writing the calls,
      So what? Counting commas is no fun either. Also, a good IDE will help a lot there.
      and just as hard to interpret the meaning of those names when reading them.
      That's just bullshit.

      Yeah, there's a much better and cheaper way - don't name them, name the indices into @_ via constant subs, if you need names instead of numbers for sake of code clarity:

      sub FOO () { 0 }
      sub BAR () { 1 }
      sub routine {
          my $bar = $_[BAR];
          $bar += munge( $_[FOO] );
      }

      But it is crucial for that discussion to identify when it is beneficial to use named parameters, and why. I can think of:

      • frameworks - you write code that gets called, and there's a convention for what each call brings along. POE is a good example
      • looking up a subroutine or method - you want to make use of some subroutines you use seldom, and a quick glance should suffice to know what it needs
      • myriads of options - but mostly you need just a few of them. Tk is a good example for that

      All other reasons seem to be based on gusto. But then, in early perl OO, objects were mostly blessed hashrefs (tutorials and perl pods are full of them), and much unreflected use of named parameters stems from there, I guess.

      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
        Yeah, there's a much better and cheaper way - don't name them, name the indices into @_ via constant subs,

        Yeah! I demonstrated the potential for that back in Micro optimisations can pay off, and needn't be a maintenance problem, and despite the jury's verdict, the performance gains attributable to minimising subroutine call overhead in that type of CPU-intensive, heavily iterative code (3D graphics with hidden line removal) are distinctly measurable.

        I'd only use this for subs that are unavoidably called at the centre of several levels of loop. The 2D & 3D Vector classes in the code in Re: Re^2: Micro optimisations can pay off, and needn't be a maintenance problem (I don't believe it) are a perfect example of the sort of code that can benefit from this technique.

        Any particular reason for using constant subs? It achieves the same thing, but to me use constant is just clearer about the intent and saves a little typing:

        use constant { FOO => 0, BAR => 1 };

        What would be really cool is for an alternative sub declaration syntax that declared scoped constants for subs. Eg.

        sub Point3D::new( CLASS, X, Y, Z ) { return bless [ @_[ X, Y, Z ] ], $_[ CLASS ]; }
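Until something like that syntax exists, roughly the same effect can be had with use constant, at the cost of the names being package-scoped rather than scoped to the sub. A sketch, assuming the Point3D constructor takes exactly three coordinates:

```perl
use strict;
use warnings;

package Point3D;

# Package-scoped index names; the proposed syntax would scope them to the sub.
use constant { CLASS => 0, X => 1, Y => 2, Z => 3 };

sub new {
    # Slice the coordinates straight out of @_, no intermediate lexicals.
    return bless [ @_[ X, Y, Z ] ], $_[CLASS];
}

package main;

my $p = Point3D->new(1, 2, 3);
```

Note that X, Y and Z index @_ here, not the blessed array, where the coordinates end up at positions 0 to 2.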

        But for most general purpose subroutines and method calls, the need for named parameters--i.e. calls that take so many arguments that naming them is beneficial beyond an aide-mémoire for the casual tourist to the code--is strongly indicative of something seriously wrong in the design of the API.

      I respectfully disagree.

      Named parameters mean I don't have to pass a string of undefs because one particular call doesn't use those parameters. APIs using positional parameters have a way of requiring a difficult upgrade path.

      It's also self-documenting -- instead of a list of variables, each variable is named, which can only help the future software forensic expert.

      Many years ago, I wrote a User Interface program in C, and one of the things that I used was lots of parameter passing, knowing enough that global variables were not the answer. Eventually, I had a couple of routines that required a dozen or so parameters, and as the code matured into a lovely congealed mass of spaghetti, I began to dread getting in there to fiddle with calls to that code, precisely because I had to add 'just one more' parameter at the end.

      The alternative could have been to pass in a pointer to a struct, which is more or less a hashref, but I wasn't secure enough in my abilities to do that. Too bad, because it would have been the right thing to do, just as using a hashref is the right thing to do.
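A small sketch of the upgrade-path argument, using a hashref the way one would use a pointer to a struct. draw_box and its keys are hypothetical; the point is that 'border' could be added later without touching any existing call.

```perl
use strict;
use warnings;

# Hypothetical UI routine taking a hashref, much like a pointer to a struct.
# 'border' was added later; callers written before it still work unchanged.
sub draw_box {
    my ($arg) = @_;
    my $width  = $arg->{width};
    my $height = $arg->{height};
    my $border = exists $arg->{border} ? $arg->{border} : '*';
    my @rows = ($border x $width) x $height;
    return join "\n", @rows;
}

my $old_style = draw_box({ width => 3, height => 2 });
my $new_style = draw_box({ width => 3, height => 2, border => '#' });
```

With positional parameters the new argument would have to go at the end, after every other optional one.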

      Alex / talexb / Toronto

      "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

        Named parameters mean I don't have to pass a string of undefs because one particular call doesn't use those parameters. APIs using positional parameters have a way of requiring a difficult upgrade path.

        I personally believe that named parameters can indeed be very useful. In this sense Perl 6 with its extremely complete and flexible sub signatures is fantastic. Perl 5 is also charming for the far reaching semantics it can get out of its very simple mechanism of parameter passing, allowing one to emulate named parameters.

        However, as far as your remark about "a string of undefs" (I presume you really mean "list") is concerned, I would like to point out that while the fact that several commas "collapse" into one fits perfectly well into Perl's semantics, I have occasionally desired say $x,,,,$y to be a shortcut for $x,undef,undef,undef,$y.
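For anyone who has not run into the collapsing commas, a short sketch (the variable names are arbitrary):

```perl
use strict;
use warnings;

my ($x, $y) = (1, 2);

# Consecutive commas collapse, so this list has two elements, not five.
my @collapsed = ($x,,,,$y);

# The placeholders have to be spelled out to keep $y in the fifth slot.
my @explicit = ($x, undef, undef, undef, $y);

my $collapsed_count = @collapsed;
my $explicit_count  = @explicit;
```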

        --
        If you can't understand the incipit, then please check the IPB Campaign.
        Named parameters mean I don't have to pass a string of undefs because one particular call doesn't use those parameters.

        I don't suppose you have any concrete examples you'd care to share?

        APIs using positional parameters have a way of requiring a difficult upgrade path.

        And yet, other than tcl, I can't find reference to a single other language that has felt the need to implement named parameters?

        Don't take me wrongly. There are absolutely some calls in many APIs (from many languages) that would benefit from this kind of self documentation.

      • CreateWindow() with its 11 parameters, some of which are themselves structs or bit-fields, is an obvious candidate.
      • CreateFile() with its 7 parameters including 4 bit-fields is another.

        But by and large, most of them are constructors. And where APIs regularly require the user to supply a list of undefs in order to use the call, archetypically select undef,undef,undef, 0.1; these are generally and widely acknowledged, even by their authors, as being "ones that got away".

        With most functions that sometimes require more than 3 parameters, there is a 'natural ordering' that means that any omitted parameters will come at the end. Eg. substr, splice, read. Even in a function rich API like Perl's there are surprisingly few calls that require more than 3 args, and almost none that require the use of placeholders for distinct functionality.

        And that's the clue for me. If an API (beyond constructors) cannot be designed such that any omitted arguments fall at the end, then it is really two (or more) APIs that have been conflated. select is the prime example as noted above, and it isn't hard to see how to change that:

        • my $old = setStdout( $new );
          sub setStdout { my $new = shift; return select( $new ); }
        • usleep( 0.1 );
          sub usleep { my $time = shift; return select undef, undef, undef, $time; }
        • select( $read, $write, $error, $time );

          Of course, IO::Select does a much better job of dealing with this form.
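For the third form, a minimal sketch of the IO::Select way. It assumes a Unix-like system, where select works on pipes; the pipe is only there to give the example a handle that becomes readable on demand.

```perl
use strict;
use warnings;
use IO::Handle;
use IO::Select;

# A pipe gives us a handle that becomes readable once something is written.
pipe(my $reader, my $writer) or die "pipe failed: $!";
$writer->autoflush(1);

my $sel = IO::Select->new($reader);

# Nothing written yet: can_read() waits up to the timeout and returns no handles.
my @ready_before = $sel->can_read(0.1);

print {$writer} "ping\n";

# Now the read end is reported as ready.
my @ready_after = $sel->can_read(0.1);
```

No undef placeholders anywhere, which is rather the point being made above.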


