You seem to have a basic misunderstanding.

If 0,4,1,2,3,5,9,6,7,8 has different odds of happening than 0,4,1,2,3,6,8,9,7,5 then by definition you aren't selecting a truly random permutation. It may be pretty random. It may be good enough for some purposes. But it isn't truly random.

With your algorithm there are correlations between where numbers appear. There shouldn't be.

About the bigger problem, if you explored this problem in detail, here is what you'd find. With the pure bit-string implementation you can reduce (in Perl) memory requirements by a factor of several hundred. In C you can reduce it by a factor of 30 or so. (Perl has more overhead in its data structures...) In either language the time to run goes like O(n*n). The algorithm that you have to use at each step is, At step i pick a random number j from 1 to (n-i), count off until you have j unselected bits, announce which number that was at and select that bit. The only optimization that you can use is to count from the large end if j > (n-i)/2.

Every way of improving that run time either comes at the cost of accepting a wrong answer, or using more memory. You've tried the track of making the answer wrong. An approach that uses more memory is to take the string and dividing it into blocks of 33 bytes. The first byte says how many unselected bits there are in the next 32 bytes. Then when you walk to find the j'th bit, you can just look at the first byte, the 34'th byte, the 67'th byte, etc until you narrow it down to a block of 256 bits that you can then scan. This is still O(n*n) runtime, but the performance improvement is so dramatic that it may be useable when the absolutely memory efficient version is not.

Adding memory only helps to a point. The shuffle solution that several people give is O(n) memory usage and O(n) runtime. Doing things that take more memory than that is counterproductive - the time taken to allocate and throw around memory has to be worse than what you can gain. If you're not running out of memory, that is the right solution to use. Period. (If you are running out of memory, I'd suggest implementing that one in C before doing something sophisticated in Perl.)


In reply to Re^5: Generating 0 .. N Randomly and Efficiently by tilly
in thread Generating 0 .. N Randomly and Efficiently by Limbic~Region

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.