in reply to (tye)Re2: Creating an array of unique numbers (Benchmarks!)
in thread Creating an array of unique numbers

I expected much worse results, but here are the results of grabbing 900 unique numbers from a list of 1000:
Benchmark: timing 5000 iterations of test_archon, test_jcwren, test_lu +cs, test_lucsjcwren, test_lucsjcwren2... test_archon: 259 wallclock secs (242.45 usr + 0.14 sys = 242.59 CPU) test_jcwren: 213 wallclock secs (202.54 usr + 0.16 sys = 202.70 CPU) test_lucs: 285 wallclock secs (267.02 usr + 0.27 sys = 267.28 CPU) test_lucsjcwren: 268 wallclock secs (248.88 usr + 0.24 sys = 249.12 C +PU) test_lucsjcwren2: 255 wallclock secs (231.91 usr + 0.21 sys = 232.12 +CPU)
This is where jcwren's algorithm begins to outperform lucs and mine. I'm surprised mine is still about as fast as lucs', though.

Replies are listed 'Best First'.
Re (tilly) 5: Creating an array of unique numbers (Benchmarks!)
by tilly (Archbishop) on Mar 28, 2001 at 09:30 UTC
    If you are doing a fixed portion k of an array of size n, your algorithm is O(n). The constant gets worse the closer that k is to 1. Unless my back of the envelope calculation is wrong your constant is proportional to k-log(1-k) where that logarithm is the natural log. Throw in the fact that you are only selecting k*n elements that works out to your selecting things 1-(log(1-k)/k) times for every output (on average). If I put in k = 0.9 that works out to be a factor of about 3.56.

    So while if you are selecting all but a few elements, your approach is slow, if you are selecting up to a fixed proportion, it should be just fine.