And you are still missing my point: asymptotic analysis is all very well and good, but if you want to optimize your code for speed, you have to pay attention to the constant multipliers.

Suppose I use exact arithmetic:

    use Benchmark qw(:all);
    use Math::BigInt;

    # Divide and conquer: product of $m..$n, split at the midpoint.
    sub recurse {
        unshift @_, 1 if 2 != @_;    # recurse($n) means recurse(1, $n)
        my ($m, $n) = @_;
        if ($m < $n) {
            my $k = int($m/2 + $n/2);
            return recurse($m, $k) * recurse($k+1, $n);
        }
        else {
            return Math::BigInt->new($m);
        }
    }

    # Plain loop: multiply the running product by 2..$n.
    sub iterate {
        my $result = Math::BigInt->new(1);
        $result *= $_ for 2..shift;
        return $result;
    }

    # Built-in factorial method.
    sub direct {
        my $result = Math::BigInt->new(shift);
        return $result->bfac();
    }

    cmpthese(1000, {
        'recurse' => sub { recurse(100) },
        'iterate' => sub { iterate(100) },
        'direct'  => sub { direct(100) },
    });
yields
    Benchmark: timing 1000 iterations of direct, iterate, recurse...
        direct:  1 wallclock secs ( 1.68 usr +  0.00 sys =  1.68 CPU) @ 595.24/s (n=1000)
       iterate: 10 wallclock secs ( 9.57 usr +  0.00 sys =  9.57 CPU) @ 104.49/s (n=1000)
       recurse:  9 wallclock secs ( 9.54 usr +  0.00 sys =  9.54 CPU) @ 104.82/s (n=1000)
              Rate iterate recurse  direct
    iterate 104/s      --     -0%    -82%
    recurse 105/s      0%      --    -82%
    direct  595/s    470%    468%      --
So the crossover between iterative and recursive happens at about n = 100. Of course, the bfac method is by far the fastest, as it eliminates all those function calls.

If I bend the rules of the OP by using Math::GMP, the crossover doesn't happen until about n = 25_000.

    use Benchmark qw(:all);
    use Math::GMP;

    # Divide and conquer: product of $m..$n, split at the midpoint.
    sub recurse {
        unshift @_, 1 if 2 != @_;    # recurse($n) means recurse(1, $n)
        my ($m, $n) = @_;
        if ($m < $n) {
            my $k = int($m/2 + $n/2);
            return recurse($m, $k) * recurse($k+1, $n);
        }
        else {
            return Math::GMP->new($m);
        }
    }

    # Plain loop: multiply the running product by 2..$n.
    sub iterate {
        my $result = Math::GMP->new(1);
        $result *= $_ for 2..shift;
        return $result;
    }

    # Built-in factorial method.
    sub direct {
        my $result = Math::GMP->new(shift);
        return $result->bfac();
    }

    cmpthese(10, {
        'recurse' => sub { recurse(25_000) },
        'iterate' => sub { iterate(25_000) },
        'direct'  => sub { direct(25_000) },
    });
yields
    Benchmark: timing 10 iterations of direct, iterate, recurse...
        direct:  0 wallclock secs ( 0.24 usr +  0.01 sys =  0.25 CPU) @ 40.00/s (n=10)
                (warning: too few iterations for a reliable count)
       iterate:  8 wallclock secs ( 7.81 usr +  0.00 sys =  7.81 CPU) @  1.28/s (n=10)
       recurse:  8 wallclock secs ( 7.73 usr +  0.00 sys =  7.73 CPU) @  1.29/s (n=10)
              Rate iterate recurse  direct
    iterate 1.28/s      --     -1%    -97%
    recurse 1.29/s      1%      --    -97%
    direct  40.0/s   3024%   2992%      --
There is an important lesson here: the algorithm with the best asymptotic behavior isn't always the fastest. If speed is important, decide on the parameter domain in which the algorithms will be used, and benchmark the candidates over that domain.
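To look for a crossover on your own machine, one option is to run the same comparison at several sizes instead of just one. This is only a rough sketch: the sizes 25, 100 and 400 are arbitrary picks of mine, not measurements from the runs above, and the numbers you get will depend on your Perl and Math::BigInt versions.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Math::BigInt;

# Divide and conquer: product of $m..$n, split at the midpoint.
sub recurse {
    unshift @_, 1 if @_ != 2;    # recurse($n) means recurse(1, $n)
    my ($m, $n) = @_;
    return Math::BigInt->new($m) if $m >= $n;
    my $k = int(($m + $n) / 2);
    return recurse($m, $k) * recurse($k + 1, $n);
}

# Plain loop: multiply the running product by 2..$n.
sub iterate {
    my $result = Math::BigInt->new(1);
    $result *= $_ for 2 .. shift;
    return $result;
}

# Illustrative sizes only; replace them with the range you actually care about.
for my $n (25, 100, 400) {
    print "n = $n\n";
    # A negative count asks Benchmark to run each sub for at least
    # that many CPU seconds instead of a fixed number of iterations.
    cmpthese(-1, {
        recurse => sub { recurse($n) },
        iterate => sub { iterate($n) },
    });
}
```

Using a negative count avoids the "too few iterations" warning at the sizes where one call is very fast, at the cost of a longer total run.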

-Mark


In reply to Re: Re: Re: Re: Re: What is the fastest pure-Perl implementation of XXX? by kvale
in thread What is the fastest pure-Perl implementation of XXX? by dragonchild
