this exhibits a pretty strong bias toward the lower numbers generated, and against the higher numbers.

Yes. A side-effect of my lazy way of attempting to ensure that at least 20 numbers are produced each time.

With 400 inputs to choose from, the selector value should be 0.05 (20/400), not 0.075; but the nature of random is that, whilst 0.05 produces a fair pick:

[undef, 999508, 999959, 1000278, 1002083, 999969, 1001388, 1002007, 999127, 1000314, 1001289, 1000014, 999255, 1000929, 1001682, 1000862, 998954, 1002277, 999569, 1000337, 999569]

It will on occasion produce as many as 44 values or as few as 3:

[undef, undef, undef, 2, 7, 33, 157, 398, 1041, 2437, 5333, 9541, 16508, 25675, 37569, 50824, 64506, 76623, 85656, 90711, 91374, 86560, 78584, 68077, 56808, 44695, 33849, 24855, 17439, 11775, 7601, 4708, 2839, 1690, 981, 568, 290, 137, 74, 41, 12, 9, 11, 1, 1]

By raising the selector value to 0.075, I made it far more likely that it would produce at least 20 values. The list slice ensures that it is not more than 20; but also introduces the bias by always throwing away the higher values when it overproduces.
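The effect of the truncation can be seen in a small simulation. This is only a sketch: `biased_pick` and `tally` are names made up for illustration, and the truncation to 20 values stands in for the list slice described above rather than reproducing the original code exactly.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): one biased draw.
# Each of the 400 inputs (1 .. 20, each repeated 20 times) is kept with
# probability $p, then the list is truncated to at most 20 values.
sub biased_pick {
    my( $p ) = @_;
    my @picked = grep{ rand(1) < $p } map{ ($_) x 20 } 1 .. 20;
    $#picked = 19 if @picked > 20;    # stand-in for the slice: drops the highest values
    return @picked;
}

# Count how often each value 1 .. 20 survives over $runs draws.
sub tally {
    my( $p, $runs ) = @_;
    my @counts = (0) x 21;
    for( 1 .. $runs ) {
        ++$counts[ $_ ] for biased_pick( $p );
    }
    return @counts;
}

my @counts = tally( 0.075, 5_000 );
printf "value %2d: %d\n", $_, $counts[ $_ ] for 1, 10, 20;
```

Because the inputs are in ascending order, whatever is cut off when more than 20 values are picked always comes from the top of the range, so the count for 20 should come out far lower than the count for 1.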

The following results, from the above code with the only change being 0.05 => 0.075, demonstrate that the distribution over the range is still very fair; but on average 50% more numbers are produced each time before the slice operation trims them back (and introduces the bias). They also show that the probability of under-producing is greatly lessened:

[undef, 1495533, 1499609, 1498974, 1499522, 1501930, 1501314, 1499981, 1501222, 1499646, 1500600, 1500068, 1500915, 1498017, 1500384, 1501031, 1498257, 1500431, 1501058, 1498359, 1500716]

[undef, undef, undef, undef, undef, undef, undef, undef, undef, 2, 12, 31, 75, 170, 373, 768, 1568, 2718, 4802, 7795, 12096, 17806, 24879, 32813, 41477, 51438, 59539, 66763, 72668, 74986, 75575, 73039, 68515, 61653, 53917, 46112, 37751, 30042, 23138, 17536, 12842, 9196, 6284, 4298, 2803, 1748, 1053, 721, 429, 253, 156, 80, 43, 17, 8, 8, 2, undef, 1, undef, 1]
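The spread in how many values are produced is what a binomial distribution predicts for 400 independent picks. As a rough cross-check of the counts above, the following sketch computes the chance of getting exactly 20, and of getting fewer than 20, for both selector values. `binom_pmf` and `binom_below` are hypothetical helper names, not part of any module.

```perl
#! perl
use strict;
use warnings;

# Binomial probability of exactly $k successes in $n trials, each with
# probability $p; computed in log space to avoid overflow.
sub binom_pmf {
    my( $n, $k, $p ) = @_;
    my $log = $k * log( $p ) + ( $n - $k ) * log( 1 - $p );
    $log += log( ( $n - $k + $_ ) / $_ ) for 1 .. $k;    # log of C(n, k)
    return exp( $log );
}

# Probability of fewer than $k successes.
sub binom_below {
    my( $n, $k, $p ) = @_;
    my $sum = 0;
    $sum += binom_pmf( $n, $_, $p ) for 0 .. $k - 1;
    return $sum;
}

for my $p ( 0.05, 0.075 ) {
    printf "p=%.3f  mean=%.1f  P(exactly 20)=%.4f  P(fewer than 20)=%.4f\n",
        $p, 400 * $p, binom_pmf( 400, 20, $p ), binom_below( 400, 20, $p );
}
```

The figures can be compared with the observed frequencies: 91374 of the 1e6 runs at 0.05 produced exactly 20 values, and at 0.075 only a small fraction of runs produced fewer than 20.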

This could be fixed by repeating the process until exactly 20 numbers come out, which ensures fairness:

#! perl -slw
use strict;
use Data::Dump qw[ pp ];
$Data::Dump::WIDTH = 300;

my( @counts, @ns );
for( 1 .. 1e6 ) {
    my @orderedRands = grep{ rand(1) < 0.05 } map{ ($_) x 20 } 1 .. 20;
    while( @orderedRands != 20 ) {
        @orderedRands = grep{ rand(1) < 0.05 } map{ ($_) x 20 } 1 .. 20;
    }
    ++$ns[ @orderedRands ];
    ++$counts[ $_ ] for @orderedRands;
}
pp \@counts, \@ns;
__END__
C:\test>junk62
(
  [undef, 1000652, 999987, 1000022, 999969, 999146, 1000961, 1000568, 1000129, 1000725, 999884, 999509, 999756, 1000538, 999763, 1000708, 1000826, 999799, 998778, 998714, 999566],
  [undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, undef, 1000000],
)

Of course that is far more expensive than doing the sort that it avoids.
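To put a rough number on that cost, the following sketch counts how many selection passes are needed before exactly 20 values come out. `attempts_until_exact` is a hypothetical helper written for this illustration, and the run count is arbitrary.

```perl
#! perl
use strict;
use warnings;

# Hypothetical helper: repeat the 0.05 selection until exactly 20 values
# come out, and return how many attempts that took.
sub attempts_until_exact {
    my $attempts = 0;
    my @picked;
    do {
        @picked = grep{ rand(1) < 0.05 } map{ ($_) x 20 } 1 .. 20;
        ++$attempts;
    } until @picked == 20;
    return $attempts;
}

my $runs  = 1_000;
my $total = 0;
$total += attempts_until_exact() for 1 .. $runs;
printf "average attempts per accepted list: %.2f\n", $total / $runs;
```

Since only about 9% of passes yield exactly 20 values (91374 in 1e6 in the counts above), the average should land somewhere near 11 passes of 400 rand() calls each for every accepted list.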

But then, my post was nothing more than a semi-humorous response to a question that itself is something of a joke.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^3: random #s by BrowserUk
in thread random #s by cboPerl
