Re^6: Behold! The power of recursion.

Replies are listed 'Best First'.
Re^7: Behold! The power of recursion. by ikegami (Patriarch) on Feb 17, 2006 at 23:17 UTC
Indeed! Benchmark results: `Rate uses_goto normal uses_amp uses_goto 94719/s -- -10% -14% normal 105040/s 11% -- -4% uses_amp 109823/s 16% 5% -- Rate uses_goto normal uses_amp uses_goto 94448/s -- -8% -14% normal 103141/s 9% -- -7% uses_amp 110408/s 17% 7% -- Rate uses_goto normal uses_amp uses_goto 94148/s -- -11% -14% normal 105616/s 12% -- -4% uses_amp 109985/s 17% 4% --` [download] Benchmark code: Read more... (746 Bytes)	[reply] [d/l] [select]
Re^8: Behold! The power of recursion. by diotalevi (Canon) on Feb 18, 2006 at 00:26 UTC
Manually optimize the tail call away yourself and get a very nice improvement in speed. I switched back to your node's parent's code because it was recursive and yours wasn't. I haven't decided where exactly recursion is so utterly losing `Rate normal goto amp unrolled normal 7.91/s -- -75% -91% -95% goto 31.9/s 304% -- -63% -81% amp 85.5/s 981% 168% -- -50% unrolled 169/s 2044% 431% 98% --` [download] Read more... (2 kB) ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊	[reply] [d/l] [select]
Re^9: Behold! The power of recursion. by BrowserUk (Patriarch) on Feb 18, 2006 at 04:53 UTC
The biggest problem with the 'normal' recursive form is that it chews memory like it was an all-you-can-eat buffet. Try changing the 2000 items to 20,000 in your benchmark and listen to your harddisk go into a death spiral as your machine tries to find enough virtual ram. (Maybe higher if you've more ram than I). None of the other versions suffer this effect and that's the greatest benefit of the goto and `&rec` forms. That said, a rather simpler construction of the & form will, despite needing to copy the input list to compensate for it's destructive nature, beat the unrolled form to over 2000 items, (maybe higher if your memory allocator is more efficent than mine), and fairly substantially for smaller lists, which probably covers the vast majority of uses of `max` in the wild. I excluded the 'normal' form because of the memory problem, and the goto form because it is so slow that it makes for silly percentages: `sub _r_max { my $h = shift; return $h unless @_; $_[0] = $h if $h > $_[0]; &_r_max; } sub r_max{ _r_max( @{[ @_ ]} ) } __END__ C:\test>junk3 -N=100 Rate amp unrolled r_max amp 4755/s -- -35% -50% unrolled 7275/s 53% -- -24% r_max 9522/s 100% 31% -- C:\test>junk3 -N=2000 Rate amp unrolled r_max amp 183/s -- -50% -51% unrolled 363/s 98% -- -2% r_max 372/s 103% 2% --` [download] But of course, a simpler implementation of the iteratative form is also possible. Again, it does an initial copy of the input list to avoid destructive modifications to it. Despite this, it beats all the others quite easily upto half a million items: sub _i_max {{ my $h = shift; return $h unless @_; $_[0] = $h if $h > $_[0]; redo; }} sub i_max{ _i_max( @{[ @_ ]} ) } __END__ C:\test>junk3 -N=5e3 Rate amp r_max unrolled i_max amp 70.4/s -- -48% -52% -73% r_max 136/s 94% -- -7% -47% unrolled 147/s 110% 8% -- -43% i_max 257/s 265% 88% 74% -- C:\test>junk3 -N=5e4 Rate amp r_max unrolled i_max amp 6.79/s -- -47% -53% -72% r_max 12.7/s 88% -- -12% -47% unrolled 14.5/s 113% 14% -- -39% i_max 23.9/s 252% 87% 65% -- C:\test>junk3 -N=5e5 (warning: too few iterations for a reliable count) Rate amp r_max unrolled i_max amp 0.678/s -- -47% -53% -72% r_max 1.27/s 88% -- -12% -47% unrolled 1.44/s 112% 13% -- -39% i_max 2.38/s 251% 87% 65% -- [download] It seems to be able to achieve 65% faster than it's nearest rival even though in the last case, the copy pushed my machine into swapping other code out to acquire the required space. I was going to add a test for the size of the input list in the first level sub and switch to a non-copying alternative when the list was greater than whatever size where the copying would negate the performance it allowed, but I run out of virtual memory before that ever occurs. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.	[reply] [d/l] [select]

Indeed!

Benchmark results:

              Rate uses_goto    normal  uses_amp
uses_goto  94719/s        --      -10%      -14%
normal    105040/s       11%        --       -4%
uses_amp  109823/s       16%        5%        --

              Rate uses_goto    normal  uses_amp
uses_goto  94448/s        --       -8%      -14%
normal    103141/s        9%        --       -7%
uses_amp  110408/s       17%        7%        --

              Rate uses_goto    normal  uses_amp
uses_goto  94148/s        --      -11%      -14%
normal    105616/s       12%        --       -4%
uses_amp  109985/s       17%        4%        --
[download]

Benchmark code:

Read more... (2 kB)

⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

[reply]
[d/l]
[select]

The biggest problem with the 'normal' recursive form is that it chews memory like it was an all-you-can-eat buffet. Try changing the 2000 items to 20,000 in your benchmark and listen to your harddisk go into a death spiral as your machine tries to find enough virtual ram. (Maybe higher if you've more ram than I).

None of the other versions suffer this effect and that's the greatest benefit of the goto and &rec forms.

That said, a rather simpler construction of the & form will, despite needing to copy the input list to compensate for it's destructive nature, beat the unrolled form to over 2000 items, (maybe higher if your memory allocator is more efficent than mine), and fairly substantially for smaller lists, which probably covers the vast majority of uses of max in the wild.

I excluded the 'normal' form because of the memory problem, and the goto form because it is so slow that it makes for silly percentages:

sub _r_max {
    my $h = shift;
    return $h unless @_;
    $_[0] = $h if $h > $_[0];
    &_r_max;
}
sub r_max{ _r_max( @{[ @_ ]} ) }

__END__
C:\test>junk3 -N=100
           Rate      amp unrolled    r_max
amp      4755/s       --     -35%     -50%
unrolled 7275/s      53%       --     -24%
r_max    9522/s     100%      31%       --

C:\test>junk3 -N=2000
          Rate      amp unrolled    r_max
amp      183/s       --     -50%     -51%
unrolled 363/s      98%       --      -2%
r_max    372/s     103%       2%       --
[download]

But of course, a simpler implementation of the iteratative form is also possible. Again, it does an initial copy of the input list to avoid destructive modifications to it. Despite this, it beats all the others quite easily upto half a million items:

sub _i_max {{
    my $h = shift;
    return $h unless @_;
    $_[0] = $h if $h > $_[0];
    redo;
}}
sub i_max{ _i_max( @{[ @_ ]} ) }
__END__
C:\test>junk3 -N=5e3
           Rate      amp    r_max unrolled    i_max
amp      70.4/s       --     -48%     -52%     -73%
r_max     136/s      94%       --      -7%     -47%
unrolled  147/s     110%       8%       --     -43%
i_max     257/s     265%      88%      74%       --

C:\test>junk3 -N=5e4
           Rate      amp    r_max unrolled    i_max
amp      6.79/s       --     -47%     -53%     -72%
r_max    12.7/s      88%       --     -12%     -47%
unrolled 14.5/s     113%      14%       --     -39%
i_max    23.9/s     252%      87%      65%       --

C:\test>junk3 -N=5e5
            (warning: too few iterations for a reliable count)

            Rate      amp    r_max unrolled    i_max
amp      0.678/s       --     -47%     -53%     -72%
r_max     1.27/s      88%       --     -12%     -47%
unrolled  1.44/s     112%      13%       --     -39%
i_max     2.38/s     251%      87%      65%       --
[download]

It seems to be able to achieve 65% faster than it's nearest rival even though in the last case, the copy pushed my machine into swapping other code out to acquire the required space. I was going to add a test for the size of the input list in the first level sub and switch to a non-copying alternative when the list was greater than whatever size where the copying would negate the performance it allowed, but I run out of virtual memory before that ever occurs.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

[reply]
[d/l]
[select]