As mentioned in the root node update, I found a fatal flaw in the algorithm: the necessary size of @queue does not scale as I expected. For some pathological cases, you need to store nearly the entire result array in order to maintain a correct result. I've created a version that outputs correctly by traversing contours of i + j = constant and caching N*M/2 results. Unfortunately, the real speed benefit I was getting came from using an insertion sort on a fixed-length queue, so this also kills my great performance. The code (with one print per sum):

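sub solution_1 {
    # queue solution
    # O(2*N+M) memory, O(N^2*M) time
    my ($list_ref1, $list_ref2) = @_;
    my @list1;
    my @list2;
    # Make @list1 the shorter of the two lists
    if (@$list_ref1 <= @$list_ref2) {
        @list1 = @$list_ref1;
        @list2 = @$list_ref2;
    } else {
        @list1 = @$list_ref2;
        @list2 = @$list_ref1;
    }
    # Seed the queue with the largest sum as a sentinel, so the
    # insertion scan below always finds a stopping point
    my @queue = ( $list1[-1]+$list2[-1] );
    # Walk the contours i + j = k, from the smallest indices up
    for my $k (0 .. $#list1 + $#list2) {
        for my $i (0 .. $k) {
            last if $i >= @list1;   # i only grows from here on
            my $j = $k - $i;
            next if $j >= @list2;   # j shrinks as i grows
            # Emit the smallest cached sum once half the results are cached
            print OUT (shift(@queue),"\n") if @queue >= 0.5*@list1*@list2;
            my $sum = $list1[$i]+$list2[$j];
            # Insertion sort: find the first slot holding a value >= $sum
            my $count = 0;
            $count++ until $sum <= $queue[$count];
            splice @queue, $count, 0, $sum;
        }
    }
    pop @queue;   # discard the sentinel
    print OUT "$_\n" for @queue;
}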
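
To make the contour idea concrete, here is a minimal standalone sketch. The helper names contour_pairs and insert_sorted are hypothetical (they are not part of the posted solution), and this version buffers every sum rather than flushing at the N*M/2 threshold. It only illustrates the order in which index pairs are visited and how each sum is slotted into a sorted queue.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): list every index pair [i, j]
# of an N x M grid, grouped by contour k = i + j, in increasing k.
sub contour_pairs {
    my ($n, $m) = @_;
    my @pairs;
    for my $k (0 .. $n + $m - 2) {
        for my $i (0 .. $k) {
            last if $i >= $n;    # i only grows, so nothing further fits
            my $j = $k - $i;
            next if $j >= $m;    # j shrinks as i grows, so keep looking
            push @pairs, [$i, $j];
        }
    }
    return @pairs;
}

# Hypothetical helper: insert $sum into an already-sorted array in place.
sub insert_sorted {
    my ($queue, $sum) = @_;
    my $pos = 0;
    $pos++ while $pos < @$queue && $queue->[$pos] < $sum;
    splice @$queue, $pos, 0, $sum;
    return;
}

my @list1 = (1, 5);
my @list2 = (2, 3, 10);
my @queue;
insert_sorted(\@queue, $list1[$_->[0]] + $list2[$_->[1]])
    for contour_pairs(scalar @list1, scalar @list2);
print "@queue\n";    # prints "3 4 7 8 11 15"
```

With these inputs the sums are generated in the order 3, 4, 7, 11, 8, 15. The 11 from contour 2 arrives before the 8 from the same contour, which is why sums cannot simply be printed as they are produced and some have to be held back.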

And the benchmarks:

Benchmark: timing 100 iterations of Baseline, LR_1, LR_2, Queue...
  Baseline: 0.555567 wallclock secs ( 0.55 usr +  0.00 sys =  0.55 CPU) @ 181.82/s (n=100)
            (warning: too few iterations for a reliable count)
      LR_1: 18.9476 wallclock secs (18.94 usr +  0.00 sys = 18.94 CPU) @  5.28/s (n=100)
      LR_2: 70.0044 wallclock secs (70.00 usr +  0.00 sys = 70.00 CPU) @  1.43/s (n=100)
     Queue: 132.26 wallclock secs (132.25 usr +  0.00 sys = 132.25 CPU) @  0.76/s (n=100)
            Rate    Queue     LR_2     LR_1 Baseline
Queue    0.756/s       --     -47%     -86%    -100%
LR_2      1.43/s      89%       --     -73%     -99%
LR_1      5.28/s     598%     270%       --     -97%
Baseline   182/s   23945%   12627%    3344%       --

Benchmark: timing 100000 iterations of Baseline, LR_1, LR_2, Queue...
  Baseline: 1.61376 wallclock secs ( 1.60 usr +  0.01 sys =  1.61 CPU) @ 62111.80/s (n=100000)
      LR_1: 7.19492 wallclock secs ( 7.19 usr +  0.01 sys =  7.20 CPU) @ 13888.89/s (n=100000)
      LR_2: 8.1213 wallclock secs ( 8.12 usr +  0.00 sys =  8.12 CPU) @ 12315.27/s (n=100000)
     Queue: 4.26218 wallclock secs ( 4.26 usr +  0.00 sys =  4.26 CPU) @ 23474.18/s (n=100000)
            Rate     LR_2     LR_1    Queue Baseline
LR_2     12315/s       --     -11%     -48%     -80%
LR_1     13889/s      13%       --     -41%     -78%
Queue    23474/s      91%      69%       --     -62%
Baseline 62112/s     404%     347%     165%       --

In reply to Re^5: Challenge: Sorting Sums Of Sorted Series by kennethk
in thread Challenge: Sorting Sums Of Sorted Series by Limbic~Region
