in reply to Re^3: Longest Increasing Subset
in thread Longest Increasing Subset

I'm suspicious that this problem is of the order O(2**N).

I suspect it is more like O(N!).

I also suspect that inspecting the candidate combinations for monotonicity in order of decreasing length, so that you can stop at the first monotonic one found and short-circuit the full search space, sounds like an optimisation; in reality it will end up inspecting many more possibilities than a brute-force forward search with short-circuiting of the inner iterations.
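To put a number on that: a decreasing-length search cannot stop until it has ruled out every combination longer than the answer. A quick count in Python (the 20-element list and length-7 answer are hypothetical figures, chosen to match the benchmark example later in this thread):

```python
from math import comb

# Hypothetical case: a 20-element list whose longest increasing
# subsequence turns out to have length 7.
n, lis_len = 20, 7

# A decreasing-length search must examine every combination of every
# length from n down to lis_len + 1 before it can find the answer.
ruled_out = sum(comb(n, k) for k in range(lis_len + 1, n + 1))
print(ruled_out)  # → 910596
```

That is nearly a million candidates inspected before the first length-7 combination is even considered, whereas a forward search prunes most of those branches as soon as the monotonicity check fails.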


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.


Re^5: Longest Increasing Subset
by Dallaylaen (Chaplain) on Oct 24, 2016 at 08:49 UTC
    I believe this can be reduced to LCS, which is O(much better). Diff'ing the original data with its ordered copy would yield a sequence that (1) belongs to the original sequence, (2) is ordered, and (3) is among the longest such sequences. (Answered below with code example).
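    That reduction can be sketched as follows. This is an illustrative Python version (not the code from the thread) using a textbook LCS dynamic program; it assumes the elements are distinct — with duplicates, LCS against the sorted copy yields the longest non-decreasing subsequence instead:

```python
def lcs(a, b):
    """Classic O(len(a) * len(b)) longest-common-subsequence DP."""
    m, n = len(a), len(b)
    # dp[i][j] = length of the LCS of a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    # Backtrack through the table to recover one LCS.
    out, i, j = [], m, n
    while i and j:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]

def lis_via_lcs(seq):
    # LCS of the sequence with its own sorted copy is an LIS
    # (for distinct elements).
    return lcs(seq, sorted(seq))

# Prints one longest increasing subsequence of the input (length 5).
print(lis_via_lcs([953, 317, 563, 582, 789, 629, 613, 918, 9, 680]))
```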
      I believe this can be reduced to LCS which is O(much better).

      I concur. My assessment was of the OP's somewhat naive approach to the problem, not of the optimal solution.

      reduced to LCS

      Careful. LCS can mean either: longest common substring; or -- as is applicable here -- longest common subsequence.

      And actually, the OP's problem can be further specialised to Longest increasing subsequence, and tackled with a very straightforward algorithm that is O(n log n).
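
      For reference, that O(n log n) algorithm (patience-sorting style, as described on the Wikipedia "Longest increasing subsequence" page) looks like this as a Python sketch — the code actually benchmarked in this thread is Perl, so this is only an illustration:

```python
from bisect import bisect_left

def lis(seq):
    """Longest strictly increasing subsequence in O(n log n)."""
    tails = []      # tails[k] = index in seq of the smallest tail of an
                    # increasing subsequence of length k + 1
    tail_vals = []  # the values at those indices, kept sorted for bisect
    prev = [-1] * len(seq)  # back-pointers for reconstruction
    for i, x in enumerate(seq):
        k = bisect_left(tail_vals, x)
        if k:
            prev[i] = tails[k - 1]
        if k == len(tails):
            tails.append(i)
            tail_vals.append(x)
        else:
            tails[k] = i
            tail_vals[k] = x
    # Walk the back-pointers from the tail of the longest subsequence.
    out, i = [], tails[-1] if tails else -1
    while i != -1:
        out.append(seq[i])
        i = prev[i]
    return out[::-1]

print(lis([953, 317, 563, 582, 789, 629, 613, 918, 9, 680]))
# → [317, 563, 582, 613, 680]
```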


        wow... so I implemented the wiki page's pseudocode, and benchmarked my code against the Longest Increasing Subsequence algorithm:

        results:

        20 [953 317 563 582 789 629 613 918 9 680 276 163 119 915 967 426 820 227 256 650]
        lis:   [ 317 563 582 613 680 915 967 ]
        pryrt: [ 317 563 582 629 680 915 967 ]
        Benchmark: running lis, pryrt for at least 10 CPU seconds ...
           lis: 10 wallclock secs (10.51 usr +  0.00 sys = 10.51 CPU) @ 25210.27/s (n=265086)
         pryrt: 11 wallclock secs (11.26 usr +  0.00 sys = 11.26 CPU) @  0.18/s (n=2)
                (warning: too few iterations for a reliable count)
                  s/iter     pryrt       lis
        pryrt       5.63        --     -100%
        lis    3.97e-005 14197064%        --

        With a random list just twenty elements long, my code took 5-6s per run, while the efficient algorithm managed better than 25_000 runs per second: almost 150_000 times faster, and that's with only 20 elements in the array. (My original benchmark tried cmpthese() on random-length arrays, using the defaults of my genarray() function, but after a minute I didn't feel like waiting that long, so I switched to a 10sec timethese()/cmpthese() pair. Multiple runs give similar results.)

        When run-time is money, it pays to search for known-good algorithms, rather than just trusting that you've thought of an efficient algorithm!

        Thanks ++gilthoniel for an interesting thread, and ++BrowserUK, for once again bringing his knowledge of efficient coding practices to help us all.

        To all who come upon this thread: definitely use the efficient code, rather than my original code!