in reply to Re: Generating 0 .. N Randomly and Efficiently
in thread Generating 0 .. N Randomly and Efficiently

bart,
I'd just generate an array containing the numbers 0 .. N, and shuffle it.

As would I if I wasn't trying to learn new things by intentionally trying to avoid simple/easy solutions.


The problem with your approach is that not only the chances of misses rises as you have selected more items — it is not even garanteed to ever finish

In my approach, I minimized the hit for making a miss by just selecting the nearest unchosen bit. The direction to look for that bit is random. Additionally, if there are none available, it randomly selects an endpoint which is guaranteed to be unchosen. It is guaranteed to finish.

Cheers - L~R

  • Comment on Re^2: Generating 0 .. N Randomly and Efficiently

Replies are listed 'Best First'.
Re^3: Generating 0 .. N Randomly and Efficiently
by bart (Canon) on Oct 19, 2004 at 18:40 UTC
    I wonder... if your aim is to learn something, then why is your approach so much like most newbies' approaches to shuffling arrays? I mean, picking an item, and if it's already been picked, try again... Just see Google Groups for examples on the newsgroups. That just seems like a step backward to me...

    Anyway, though you do take care to make sure every loop finds at least one item, you will not get a nice even distribution. Let me give an example.

    Suppose that your $N is 20, and by some weird coincidence, the numbers 1 .. 10 have already been picked, and nothing else. For your next pick, the numbers 12 to 20 all have 1 chance in 21 of being picked. So do the numbers 0 and 11. But if the drawn number is between 1 and 10 (about 10 chances in 21, that is almost 50%), your system will promote it to either 0 or 11...

    Net result: there's 12 chances in 21, that is over 50%, that the next number picked, will be either 0 or 11.

    So yes, this way, you'll tend to get clustered results, instead of a nicely spread, speckled, outcome.

      bart,
      I mean, picking an item, and if it's already been picked, try again

      That is not what this code does. It automatically defaults to finding the nearest unselected bit. It does this using a regex to find the first byte that is not all ones and then examining that byte bit by bit.

      Anyway, though you do take care to make sure every loop finds at least one item, you will not get a nice even distribution.

      While I didn't say statistically uniform distribution was a necessity, it is a nicetey.

      Suppose that your $N is 20, and by some weird coincidence, the numbers 1 .. 10 have already been picked, and nothing else. For your next pick, the numbers 12 to 20 all have 1 chance in 21 of being picked.

      This is just not true. The random number is not selected over the entire range but only between 2 floating end points. Those end points move everytime they are selected.

      Net result: there's 12 chances in 21, that is over 50%, that the next number picked, will be either 0 or 11.

      This is also not true. Given the way the code works in your example, all unchosen bits have the same probability of being chosen.

      As I have indicated in other replies, uneven statistical distribution is still possible even though:

      • The end points automatically change
      • The initial guess between the end points is random
      • The direction to look after a miss is random
      This is because as holes get larger, the probability one of the two numbers on either side will be chosen increases. Theoretically those holes should all grow at the same rate, but I haven't taken the time to analyze it.

      Cheers - L~R