Re^2: Searching parallel arrays. (just merge)

I don't quite follow how you track which array to read from next (yet), but godamn it's fast.

The following times yours, Limbic~Region's and mine. The data is 1 .. $N (minus anything with a 9 to stop it being a single contiguous sequence), randomly distributed across 4 arrays. The only tweaks I've made are to accumulate the results into an array rather than printing them out.

c:\test>588553-buk -N=100000
1 trial of Four arrays and a total of 100000 (1.756s total)

Found 6561 runs
c:\test>588553-lr -N=100000
1 trial of Four arrays and a total of 100000 (1.164s total)

Found 6561 runs
c:\test>588553-tye -N=100000
1 trial of Four arrays and a total of 100000 (507.209ms total)

Found 6561 runs
[download]

By way of comparison, I also tested one of the 'sort them all together' solutions (grinder's, with apologies to the other contributors!) against yours, though I had to drop $N to stop it from blowing the memory:

c:\test>588553-gdr -N=10000
1 trial of Four arrays and a total of 10000 (4.491s total)

Found 729 runs
c:\test>588553-tye -N=10000
1 trial of Four arrays and a total of 10000 (55.570ms total)

Found 729 runs
[download]

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.

Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?

"Science is about questioning the status quo. Questioning authority".

In the absence of evidence, opinion is indistinguishable from prejudice.

Comment on Re^2: Searching parallel arrays. (just merge) Select or Download Code

Replies are listed 'Best First'.
Re^3: Searching parallel arrays. (just merge) by tye (Sage) on Dec 09, 2006 at 16:35 UTC
`@next` is a linked list of the lists sorted by their first remaining item. So $p is the (index of the) list to pick from (with the smallest item). `$next[$p]` is the (index of the) next list, the one with the next smallest first remaining item. `$next[$next[$p]]` is the list after that. And -1 marks the end of the linked list. After processing one item, we move that list's index in the linked list by moving down the linked list until we find the spot where its new first remaining item falls between two other lists' first remaining items. The linked list means that we don't need to shift a whole block of array elements to do this "delete and re-insert later in the list" operation. Of course, if you used a plain array, then you could use a binary search for finding the next place to insert. And, since this is Perl, the fastest solution will likely be the one that involves the fewest opcodes instead of the one with the smallest "big O" complexity. If I were writing this in C, I'd use a heap; but those are rarely the fastest solution in Perl because slow opcode dispatch makes the more complex algorithm too heavy of a speed penalty -- you'd need a huge number of lists for a heap to pay off speed-wise in Perl. - tye	[reply] [d/l] [select]

Replies are listed 'Best First'.

Re^3: Searching parallel arrays. (just merge)
by tye (Sage) on Dec 09, 2006 at 16:35 UTC

@next is a linked list of the lists sorted by their first remaining item. So $p is the (index of the) list to pick from (with the smallest item). $next[$p] is the (index of the) next list, the one with the next smallest first remaining item. $next[$next[$p]] is the list after that. And -1 marks the end of the linked list.

After processing one item, we move that list's index in the linked list by moving down the linked list until we find the spot where its new first remaining item falls between two other lists' first remaining items.

The linked list means that we don't need to shift a whole block of array elements to do this "delete and re-insert later in the list" operation.

Of course, if you used a plain array, then you could use a binary search for finding the next place to insert.

And, since this is Perl, the fastest solution will likely be the one that involves the fewest opcodes instead of the one with the smallest "big O" complexity. If I were writing this in C, I'd use a heap; but those are rarely the fastest solution in Perl because slow opcode dispatch makes the more complex algorithm too heavy of a speed penalty -- you'd need a huge number of lists for a heap to pay off speed-wise in Perl.

- tye

[reply]
[d/l]
[select]