As mentioned in the root node update, I found a fatal flaw in the algorithm: the necessary size of @queue does not scale as I expected. In some pathological cases, nearly the entire result array has to be stored in order to maintain a correct result. I've created a version that outputs correctly by traversing contours of i + j = constant and caching N*M/2 results. Unfortunately, the real speed benefit came from using an insertion sort on a fixed-length queue, so this also kills my great performance. The code (with one print per sum):

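sub solution_1 {
    # queue solution
    # caches up to N*M/2 sums in @queue, so the earlier
    # O(2*N+M) memory estimate no longer holds
    my ($list_ref1, $list_ref2) = @_;
    my @list1;
    my @list2;
    # make @list1 the shorter of the two lists
    if (@$list_ref1 <= @$list_ref2) {
       @list1 = @$list_ref1;
       @list2 = @$list_ref2;
    } else {
       @list1 = @$list_ref2;
       @list2 = @$list_ref1;
    }

    # seed the queue with the largest sum as a sentinel for the scan below
    my @queue = ( $list1[-1]+$list2[-1] );
    # walk the contours i + j = k; the last one is k = $#list1 + $#list2
    for my $k (0 .. $#list1 + $#list2) {
        for my $i (0 .. $k) {
            last if $i >= @list1;   # i only grows, so nothing is left on this contour
            my $j = $k - $i;
            next if $j >= @list2;   # j shrinks as i grows, so keep going

            # flush the smallest cached sum once half of the results are queued
            print OUT (shift(@queue),"\n") if @queue >= 0.5*@list1*@list2;

            # insertion sort: scan for the first element >= $sum
            my $sum = $list1[$i]+$list2[$j];
            my $count = 0;
            $count++ until $sum <= $queue[$count];
            splice @queue, $count, 0, $sum;
        }
    }
    pop @queue;   # drop the sentinel; the final pair inserted the same sum
    print OUT "$_\n" for @queue;
}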
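
For anyone who wants to check the contour traversal on its own, here is a minimal sketch that separates it from the queue handling. The names contour_pairs and sorted_sums_reference are hypothetical helpers of mine, not part of the solution above, and the reference function is just a brute-force sort of every pairwise sum, so treat it as something to compare output against rather than as a contender.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the post): list every index pair (i, j)
# with 0 <= i < $n and 0 <= j < $m, in order of contour k = i + j.
sub contour_pairs {
    my ($n, $m) = @_;
    my @pairs;
    for my $k (0 .. $n + $m - 2) {
        # Clamp i so that both i and j = k - i stay inside their lists.
        my $i_min = $k - ($m - 1) > 0 ? $k - ($m - 1) : 0;
        my $i_max = $k < $n - 1 ? $k : $n - 1;
        push @pairs, [ $_, $k - $_ ] for $i_min .. $i_max;
    }
    return @pairs;
}

# Brute-force reference (assumption: numeric lists): collect every
# pairwise sum along the contours, then sort the whole lot.
sub sorted_sums_reference {
    my ($list1, $list2) = @_;
    my @sums;
    for my $p (contour_pairs(scalar @$list1, scalar @$list2)) {
        push @sums, $list1->[ $p->[0] ] + $list2->[ $p->[1] ];
    }
    return sort { $a <=> $b } @sums;
}
```

Clamping the range of i up front means each contour visits only valid pairs, so no next/last is needed inside the loop. The number of pairs returned should always be N*M, which is a quick sanity check on any contour-based variant.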
And the benchmarks:

Benchmark: timing 100 iterations of Baseline, LR_1, LR_2, Queue...
  Baseline: 0.555567 wallclock secs ( 0.55 usr +  0.00 sys =  0.55 CPU) @ 181.82/s (n=100)
            (warning: too few iterations for a reliable count)
      LR_1: 18.9476 wallclock secs (18.94 usr +  0.00 sys = 18.94 CPU) @  5.28/s (n=100)
      LR_2: 70.0044 wallclock secs (70.00 usr +  0.00 sys = 70.00 CPU) @  1.43/s (n=100)
     Queue: 132.26 wallclock secs (132.25 usr +  0.00 sys = 132.25 CPU) @  0.76/s (n=100)
            Rate    Queue     LR_2     LR_1 Baseline
Queue    0.756/s       --     -47%     -86%    -100%
LR_2      1.43/s      89%       --     -73%     -99%
LR_1      5.28/s     598%     270%       --     -97%
Baseline   182/s   23945%   12627%    3344%       --
Benchmark: timing 100000 iterations of Baseline, LR_1, LR_2, Queue...
  Baseline: 1.61376 wallclock secs ( 1.60 usr +  0.01 sys =  1.61 CPU) @ 62111.80/s (n=100000)
      LR_1: 7.19492 wallclock secs ( 7.19 usr +  0.01 sys =  7.20 CPU) @ 13888.89/s (n=100000)
      LR_2: 8.1213 wallclock secs ( 8.12 usr +  0.00 sys =  8.12 CPU) @ 12315.27/s (n=100000)
     Queue: 4.26218 wallclock secs ( 4.26 usr +  0.00 sys =  4.26 CPU) @ 23474.18/s (n=100000)
            Rate     LR_2     LR_1    Queue Baseline
LR_2     12315/s       --     -11%     -48%     -80%
LR_1     13889/s      13%       --     -41%     -78%
Queue    23474/s      91%      69%       --     -62%
Baseline 62112/s     404%     347%     165%       --
In reply to Re^5: Challenge: Sorting Sums Of Sorted Series by kennethk
in thread Challenge: Sorting Sums Of Sorted Series by Limbic~Region