ysth has asked for the wisdom of the Perl Monks concerning the following question:

Reading the clarified question in random negative toggle function reminded me of an question that's been bothering me for a while. Say I have an array with N elements, which I want to randomly set to either 0 or 1 but have a total of M (<=N). I know I can loop M times finding a random 0-element to switch to a 1 element (or if M > N/2, initializing all to 1 and setting N-M elements to 0), but is there a way to do it sequentially and yet insure a random distribution?

If I just loop through, giving each element a chance to be set of (how many ones left to assign) over (how many elements are left), it seems to me that that will distort the distribution toward the end.

Update: Many thanks to all who replied. What I was looking for was a random distribution, not an even distribution, and with a total of M, not an total averaging M, determined sequentially. Abigail-II's answer fits. Using the List::Util::shuffle to prepare a sequence and pulling elements off of it as I go through the real data also would fit the bill.

Replies are listed 'Best First'.
Re: random elements with fixed totals
by BrowserUk (Patriarch) on Nov 16, 2003 at 03:07 UTC

    No need to loop. Set up the list with the appropriate numbers of 1s and 0s and then shuffle it.

    use List::Util qw[shuffle]; $m = 65; $n = 100; @list = ( (1) x $m, (0) x ( $n - $m ) ); @list = shuffle @list;

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    Hooray!
    Wanted!

Re: random elements with fixed totals
by Abigail-II (Bishop) on Nov 16, 2003 at 03:12 UTC
    I think this is in Knuth, but I'm too lazy to look it up. Assuming that $N and $M are set:
    my @a = map {my $e = rand $N -- < $M; $e && $M --; $e || 0} 1 .. $N;

    Abigail

      I looked in Knuth before asking and didn't see it.

      But that is just the algorithm I am worried will distort the distribution at the end! For instance, if M is one, the probability of the first element being set is 1/N, but the probability of the last element being set is ...

      (N-1)!/N!     Oh. (Whacks self on head.)

      Knuth 3.4.2 Random Sampling and Shuffling, Algorithm S
      Thanks, Abigail

Re: random elements with fixed totals
by Zaxo (Archbishop) on Nov 16, 2003 at 05:26 UTC

    The requirement that a set number of bits be set is a restriction which reduces randomness. In physics, that is like the contraction of a state in statistical mechanics from any configuration to one with a certain energy. It's one thing if you want a random distribution with fixed average; another if you want a totally random distribution whose average is usually near a certain value.

    The answer to your question depends on what you really want to do.

    After Compline,
    Zaxo

Re: random elements with fixed totals
by jonadab (Parson) on Nov 16, 2003 at 12:59 UTC
    If I just loop through, giving each element a chance to be set of (how many ones left to assign) over (how many elements are left), it seems to me that that will distort the distribution toward the end.

    Yes, it may, but if so, it will distort the distribution in a random way. That is to say, one time it may run low on 1s early, distorting the later part of the distribution toward 0, and another time it may have extra 1s left for the later part of the distribution. This is a random distribution.

    You need to understand the difference between a random distribution and an even distribution. If you write your algorithm to guarantee an even distribution, it's not random; if you write it to be random, you won't always get an even distribution. There's no way to guarantee both, because those are contradictory requirements.

    If this troubles you, you can take your random distribution and randomly shuffle it, as the other monk suggested, but that still won't force it to be even.

    If what you want is a mostly even distribution that's not fully predictable, you may want to take an even distribution and do localized shuffling (i.e., loop through once or twice and randomly swap some elements with other _nearby_ elements). The distribution you get from this will not *be* random, but it may appear random to the untrained eye, and it will be close to even.


    $;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
Re: random elements with fixed totals
by pg (Canon) on Nov 16, 2003 at 04:05 UTC
    "I know I can loop M times finding a random 0-element to switch to a 1 element"

    A potential complexity you didn't realize (or maybe you realized but didn't mention) is that, loop M times, does not gurantee you M different positions, as the random number returned could be duplicates (Of course, you can adjust it, if it is duplicated, but that demages the level of randomization). BrowserUK's solution is preferred.

      Thanks. Trying to be terse, I reduced needing-to-check-for-duplicates to the "0-" part of that sentence.
Re: random elements with fixed totals
by CountZero (Bishop) on Nov 16, 2003 at 21:59 UTC
    All depends on how you define "random". A sequence of M 1's followed by N-M 0's is as random as any other sequence, as they will have the exact same probability of being generated by a truly random random-number generator.

    I remember Knuth goes quite detailed into this random-sequence thing in Volume 2.

    My feeling is that it is more about "predictability": once a random sequence is generated, the randomness is gone and all you have is a sequence. The thing is that you should have no way to predict what sequence will be generated. If you put additional constraints (such as x 1's) then as soon as you have seen x 1's being generated, you know for sure that from then on only 0's will follow.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law