Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am writing bioinformatics code that needs to be fast (it runs overnight as is...)

I have a function that is called in an inner loop that takes several arguments; is it faster to access those arguments by something like

my ($arg1, $arg2, $arg3) = @_;

or the normal way like

my $arg1 = shift; my $arg2 = shift; my $arg3 = shift;

thanks for the help!

-Dan

Replies are listed 'Best First'.
Re: Silly question about function args
by tachyon (Chancellor) on Feb 09, 2003 at 10:36 UTC

    Passing args to a function is unlikely to be a significant bottleneck, as others have noted at length. To optimise code speed you need to attack the bottlenecks, and to do that you need to know where they are. They are quite often not where you think they are (in my experience!).

    Devel::DProf is the usual tool of choice for profiling code speed. For an example with extensive discussion, see Devel::Dprof is your friend.

    If you are reading large files, you can get a speed increase of several orders of magnitude by reading blocks of data rather than single lines. See Re: Performance Question.
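    As a rough illustration of block reads, here is a minimal sketch that counts newlines by pulling fixed-size chunks with read() instead of iterating line by line. The sub name count_lines_blockwise and the 64 KB default block size are assumptions made for this example, not something taken from the linked node.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Count newline characters by reading fixed-size blocks instead of lines.
# $block_size is an assumed tuning knob; measure before settling on a value.
sub count_lines_blockwise {
    my ($path, $block_size) = @_;
    $block_size ||= 64 * 1024;

    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;

    my ($count, $buffer) = (0, '');
    while (read($fh, $buffer, $block_size)) {
        # tr/// with an empty replacement only counts; $buffer is unchanged.
        $count += ($buffer =~ tr/\n//);
    }
    close $fh;
    return $count;
}

# Small demonstration on a temporary file of 1000 lines.
my ($demo_fh, $demo_path) = tempfile(UNLINK => 1);
print {$demo_fh} "line $_\n" for 1 .. 1000;
close $demo_fh;
print count_lines_blockwise($demo_path, 4096), "\n";   # prints 1000
```

    Counting is the easy case. Code that needs whole records has to cope with lines that straddle a block boundary, for example by carrying the unterminated tail of one block over to the next.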

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Silly question about function args
by rob_au (Abbot) on Feb 09, 2003 at 09:47 UTC
      Dunno. This one seemed a reasonable, if admittedly silly, question. Their code is running, and they want to make it faster. That doesn't strike me as premature optimization. I agree, though, that they will need to consider much more serious measures than this, so the micro-optimization point makes sense.

      --- demerphq
      my friends call me, usually because I'm late....

Re: Silly question about function args
by BrowserUk (Patriarch) on Feb 09, 2003 at 09:44 UTC

    You pays your money and takes your choice:^)

    #! perl -slw
    use strict;
    use Benchmark qw[cmpthese];

    use constant ARG1 => 0;
    use constant ARG2 => 1;
    use constant LAST => -1;

    sub directly{
        return "$_[ARG1] - $_[ARG2]" if $_[LAST];
    }

    sub shifty{
        my ($arg1, $arg2, $arg3) = (shift, shift, shift);
        return "$arg1 - $arg2" if $arg3;
    }

    sub listy{
        my ($arg1, $arg2, @others) = @_;
        return "$arg1 - $arg2" if $others[-1];
    }

    sub listya{
        my (@args) = @_;
        return "$args[0] - $args[1]" if $args[-1];
    }

    our @a100 = 1..100;

    cmpthese( -3, {
        directly3   => 'directly 1, "TWO", 3.0;',
        directly100 => 'directly @a100;',
        shifty3     => 'shifty 1, "TWO", 3.0;',
        shifty100   => 'shifty @a100;',
        listy3      => 'listy 1, "TWO", 3.0;',
        listya      => 'listya 1, "TWO", 3.0',
        listya100   => 'listya @a100;',
    });

    __END__
    Name "main::a100" used only once: possible typo at C:\test\233863.pl line 14.
    Benchmark: running directly100, directly3, listy3, listya, listya100, shifty100, shifty3, each for at least 3 CPU seconds
    directly100:  5 wallclock secs ( 3.19 usr +  0.00 sys =  3.19 CPU) @ 27542.27/s (n=87970)
      directly3:  1 wallclock secs ( 3.03 usr +  0.00 sys =  3.03 CPU) @ 47871.13/s (n=145241)
         listy3:  4 wallclock secs ( 3.10 usr +  0.00 sys =  3.10 CPU) @ 19624.88/s (n=60896)
         listya:  4 wallclock secs ( 3.04 usr +  0.00 sys =  3.04 CPU) @ 19505.42/s (n=59394)
      listya100:  3 wallclock secs ( 3.12 usr +  0.00 sys =  3.12 CPU) @ 3179.52/s (n=9936)
      shifty100:  4 wallclock secs ( 3.04 usr +  0.00 sys =  3.04 CPU) @ 21964.84/s (n=66839)
        shifty3:  5 wallclock secs ( 3.09 usr +  0.00 sys =  3.09 CPU) @ 30639.55/s (n=94523)

                   Rate listya100 listya listy3 shifty100 directly100 shifty3 directly3
    listya100    3180/s        --   -84%   -84%      -86%        -88%    -90%      -93%
    listya      19505/s      513%     --    -1%      -11%        -29%    -36%      -59%
    listy3      19625/s      517%     1%     --      -11%        -29%    -36%      -59%
    shifty100   21965/s      591%    13%    12%        --        -20%    -28%      -54%
    directly100 27542/s      766%    41%    40%       25%          --    -10%      -42%
    shifty3     30640/s      864%    57%    56%       39%         11%      --      -36%
    directly3   47871/s     1406%   145%   144%      118%         74%     56%        --

    Direct access is hands down fastest, with shift second and list assignment last. Directly accessing 100 args is nearly as fast as shifting 3!

    However, what difference it makes, if any is discernible at all, will depend very much on how many times you're calling the sub, how deeply the call stack grows and how much you are doing in the sub. Small subs called many times with many parameters benefit much more than a few calls to subs that do lots of work.


    Examine what is said, not who speaks.

    The 7th Rule of perl club is -- pearl clubs are easily damaged. Use a diamond club instead.

Re: Silly question about function args
by Paladin (Vicar) on Feb 09, 2003 at 09:09 UTC
    According to Benchmark the first form is very slightly faster, but probably not enough to make much of a difference.
    use Benchmark qw/cmpthese/;

    sub s1 { my ($arg1, $arg2, $arg3) = @_; };
    sub s2 {
        my $arg1 = shift;
        my $arg2 = shift;
        my $arg3 = shift;
    }

    cmpthese(200000, {
        shift => ' { s2(1,2,3) }',
        list  => ' {s1(1,2,3)}',
    })

    Benchmark: timing 200000 iterations of list, shift...
          list:  3 wallclock secs ( 2.14 usr +  0.01 sys =  2.15 CPU) @ 93023.26/s (n=200000)
         shift:  2 wallclock secs ( 2.25 usr +  0.02 sys =  2.27 CPU) @ 88105.73/s (n=200000)
             Rate shift  list
    shift 88106/s    --   -5%
    list  93023/s    6%    --
      The fastest way to pass arguments is not to copy them at all, and to access them in place in @_. (This of course doesn't make for readable code.) The test case below illustrates this poorly for a simplified function.

      Update: BrowserUk's answer demonstrates this better.

      use Benchmark qw/cmpthese/;

      sub s1 { my ($arg1, $arg2, $arg3) = @_; $arg1+$arg2+$arg3 };
      sub s2 {
          my $arg1 = shift;
          my $arg2 = shift;
          my $arg3 = shift;
          $arg1+$arg2+$arg3
      }
      sub s3 { $_[0] + $_[1] + $_[2] }

      cmpthese(200000, {
          shift => ' { s2(1,2,3) }',
          list  => ' {s1(1,2,3)}',
          none  => ' {s3(1,2,3)}',
      })

      Benchmark: timing 200000 iterations of list, none, shift...
            list:  4 wallclock secs ( 1.39 usr +  0.00 sys =  1.39 CPU) @ 143884.89/s (n=200000)
            none:  1 wallclock secs ( 0.70 usr +  0.01 sys =  0.71 CPU) @ 281690.14/s (n=200000)
           shift:  2 wallclock secs ( 1.46 usr +  0.03 sys =  1.49 CPU) @ 134228.19/s (n=200000)
                Rate shift  list  none
      shift 134228/s    --   -7%  -52%
      list  143885/s    7%    --  -49%
      none  281690/s  110%   96%    --

      --
      integral, resident of freenode's #perl
      
        The fastest way to pass arguments is not to copy them at all

        Exactly. Inline the function entirely. That's much faster.

        and to access them in place in @_

        Personally I would caution against this. An innocuous change of

        my $foo=shift; $foo=~s/A.//g;
        # to
        $_[0]=~s/A.//g;

        would alter the original variable. Better to do away with the subroutine entirely. Then at least this effect is obvious and, as I said, you avoid the subroutine jump overhead.
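        To make the aliasing concrete, here is a minimal sketch of the two variants side by side. The sub names strip_copy and strip_in_place and the sample sequence are made up for illustration; the behaviour shown (elements of @_ alias the caller's variables) is standard Perl.

```perl
use strict;
use warnings;

# Copies the argument first, so the caller's variable is left untouched.
sub strip_copy {
    my $foo = shift;
    $foo =~ s/A.//g;
    return $foo;
}

# Edits $_[0] directly. @_ aliases the caller's variables, so this
# changes the caller's data as a side effect.
sub strip_in_place {
    $_[0] =~ s/A.//g;
    return $_[0];
}

my $seq_a = 'ATGACC';
my $seq_b = 'ATGACC';

strip_copy($seq_a);
strip_in_place($seq_b);

print "after strip_copy:     $seq_a\n";   # prints ATGACC
print "after strip_in_place: $seq_b\n";   # prints GC
```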

        --- demerphq
        my friends call me, usually because I'm late....

Re: Silly question about function args
by blokhead (Monsignor) on Feb 09, 2003 at 09:19 UTC
    If you're worried about speed, I don't see any way that the argument passing mechanism could be your bottleneck -- if you want to trim clock cycles, start your search elsewhere. However, if you're sure argument passing is really slowing you down, then whatever sub you're calling probably shouldn't be in a sub. If it must be in a sub, consider either using lexical variables bound by a closure (no arguments "passed"), or passing references to the large objects.
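    A minimal sketch of both suggestions, assuming a large string as the "big object". The names make_base_counter and count_base_by_ref are hypothetical, and whether either variant is actually faster in a given program is something to benchmark rather than assume.

```perl
use strict;
use warnings;

# Option 1: a closure. The returned sub sees $genome through its enclosing
# scope, so the large string is not passed on each call.
sub make_base_counter {
    my ($genome) = @_;                 # copied once, at construction time
    return sub {
        my ($base) = @_;
        my $n = () = $genome =~ /\Q$base\E/g;   # count matches
        return $n;
    };
}

# Option 2: pass a reference, so the large string itself is never copied.
sub count_base_by_ref {
    my ($genome_ref, $base) = @_;
    my $n = () = $$genome_ref =~ /\Q$base\E/g;
    return $n;
}

my $genome  = 'ACGT' x 1000;
my $counter = make_base_counter($genome);

print $counter->('G'), "\n";                      # prints 1000
print count_base_by_ref(\$genome, 'G'), "\n";     # prints 1000
```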

    Anyway, I will attempt to answer your question. You should look at the Benchmark module, which comes with your Perl distribution. I ran a few test cases, and found the shift method faster for very small arguments (ints, small strings, references), and the @_ method faster for passing larger objects (strings). Try it out yourself with whatever data is most common in your app.

    blokhead

Re: Silly question about function args
by demerphq (Chancellor) on Feb 09, 2003 at 09:47 UTC
    The first is moderately faster from what I know, and Benchmark agrees.
    use Benchmark 'cmpthese';

    cmpthese -1, {
        'shift'  => sub { my $x=shift; my $y=shift; my $z=shift; return },
        'assign' => sub { my ($x,$y,$z)=@_; },
    };
    __END__
    Benchmark: running assign, shift, each for at least 1 CPU seconds...
        assign:  2 wallclock secs ( 1.23 usr) @ 505470.78/s (n=622740)
         shift:  3 wallclock secs ( 1.01 usr) @ 435373.52/s (n=440598)
               Rate  shift assign
    shift  435374/s     --   -14%
    assign 505471/s    16%     --
    But when I add some even rudimentary code to the subs, like
    for (1..100) { ++$x; $x+=$y+=$z }
    the difference gets swamped to the point of being noise.
    Benchmark: running assign, shift, each for at least 1 CPU seconds...
        assign:  2 wallclock secs ( 1.08 usr ) @ 7097.97/s (n=7680)
         shift:  1 wallclock secs ( 1.02 usr ) @ 6973.56/s (n=7120)
             Rate  shift assign
    shift  6974/s     --    -2%
    assign 7098/s     2%     --

    Benchmark: running assign, shift, each for at least 1 CPU seconds...
        assign:  2 wallclock secs ( 1.01 usr ) @ 7587.94/s (n=7679)
         shift:  2 wallclock secs ( 1.08 usr ) @ 7592.04/s (n=8207)
             Rate assign  shift
    assign 7588/s     --    -0%
    shift  7592/s     0%     --
    Go ahead and make the change, but I'd be quite surprised if Benchmark showed much difference.

    You may need to make the function inline to see any real gains. Other tricks, like using static variables, can also help (this reduces allocation overhead but renders the code not thread safe). You may need to consider a lot of things. Poorly constructed regexes can chew up a lot of time. Without seeing the code and its usage there's no way anyone here could make any specific recommendations. :-)
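    For anyone who wants to measure the call overhead itself, here is a minimal sketch that compares calling a tiny sub in a loop against the same arithmetic written inline. The subs add3, sum_with_call and sum_inlined are invented for this example, and no timings are claimed here; the numbers will depend on your perl and your machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# A deliberately tiny sub, so the call overhead dominates its cost.
sub add3 { my ($x, $y, $z) = @_; return $x + $y + $z }

# Version 1: one sub call per loop iteration.
sub sum_with_call {
    my $total = 0;
    $total += add3($_, 1, 2) for 1 .. 1000;
    return $total;
}

# Version 2: the body of add3 written inline, no call per iteration.
sub sum_inlined {
    my $total = 0;
    $total += $_ + 1 + 2 for 1 .. 1000;
    return $total;
}

# Run each version for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    call    => \&sum_with_call,
    inlined => \&sum_inlined,
});
```

    Both versions compute the same total, so any gap in the table comes from the sub call and argument handling alone.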

    You should have a look at When perl is not quite fast enough

    HTH

    --- demerphq
    my friends call me, usually because I'm late....

Re: Silly question about function args
by Coruscate (Sexton) on Feb 09, 2003 at 10:13 UTC

    When it comes to receiving your arguments, I wouldn't focus on speed as much as I would on appearance. If you are passing one or two arguments, feel free to shift them off. Myself, I generally shift for only one or two args; for any more I use the @_ method, not for speed, but for shorter, cleaner, uncluttered code. Just to exaggerate my point, which snippet looks nicer?

    # Example 1
    sub my_sub {
        my $name = shift;
        my $mail = shift;
        my $city = shift;
        print "$name, from $city, has e-mail address $mail.";
    }
    my_sub('Coruscate', 'Red Spot', 'Jupiter');

    # Example 2
    sub my_sub {
        my ($name, $mail, $city) = @_;
        print "$name, from $city, has e-mail address $mail.";
    }
    my_sub('Coruscate', 'Red Spot', 'Jupiter');

    # Example 3
    # Once I hit 4+ arguments, I hit named arguments!
    # (Yes, I know the example only has 3 args...)
    sub my_sub {
        my %q = @_;
        print "$q{name}, from $q{city}, has e-mail address $q{mail}.";
    }
    my_sub(
        name => 'Coruscate',
        mail => 'Red Spot',
        city => 'Jupiter'
    );

    Update: Another reason I really like named arguments is code that looks like the following:

    sub my_sub {
        my %q = (
            title   => 'Untitled',
            author  => 'Anonymous',
            values  => [1,1,1,1,1],
            values2 => { key1 => 'value1', key2 => 'value2' },
            @_
        );
        print "$q{title} (by $q{author}):\n",
            "\t- ", join(', ', @{$q{values}}), "\n",
            "\t- ", join(', ', values %{$q{values2}}), "\n";
    }

    my_sub(
        title   => 'My Title',
        values2 => { key1 => 'yippee!', key2 => 'booooo!' }
    );


    If the above content is missing any vital points or you feel that any of the information is misleading, incorrect or irrelevant, please feel free to downvote the post. At the same time, reply to this node or /msg me to tell me what is wrong with the post, so that I may update the node to the best of my ability. If you do not inform me as to why the post deserved a downvote, your vote does not have any significance and will be disregarded.

Re: Silly question about function args
by crenz (Priest) on Feb 09, 2003 at 18:14 UTC

    You might want to take a look at Inline::C on CPAN. It allows you to embed C source directly in your perl script. I suppose that calling a C function this way takes longer than calling a perl subroutine, so you should probably implement a whole loop in C and call that function only a few times.

Re: Silly question about function args
by pg (Canon) on Feb 09, 2003 at 18:16 UTC
    The first form would be slightly faster.

    If performance is so important to your application that you are even starting to look at subtle places like this, then why not use C instead of Perl?

    You said it runs overnight. Is it active all night, or idle most of the time? If it is active the whole night, go with C.

    Choose the right tool for the right thing.

Re: Silly question about function args
by tadman (Prior) on Feb 09, 2003 at 13:19 UTC
    I'd say use the list assignment, because if your "normal" way of declaring arguments is to use shift, you're wasting a lot of effort. @_ is more concise, and less likely to be mis-typed.

    I think it's appropriate to save shift for those special cases where the modified @_ is going to be used, something which is certainly not most of the time.
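    One such special case, as a minimal sketch: a dispatcher that shifts off the command name and then hands the shortened @_ to a handler. The %handlers table and the dispatch sub are hypothetical names chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical handler table: each handler takes its own plain argument list.
my %handlers = (
    add => sub { my ($x, $y) = @_; return $x + $y },
    cat => sub { return join '', @_ },
);

sub dispatch {
    my $command = shift;               # @_ now holds only the handler's args
    my $handler = $handlers{$command}
        or die "Unknown command: $command";
    return $handler->(@_);             # pass the modified @_ straight through
}

print dispatch('add', 2, 3), "\n";           # prints 5
print dispatch('cat', 'a', 'b', 'c'), "\n";  # prints abc
```

    Here shift earns its keep because the remaining arguments are forwarded as a list; a plain list assignment would need an extra array to hold them.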