One major issue with testing for randomness is the matter of sample size. You seem to be implying that randomness requires an even distribution, but this is incorrect, at least with a small-to-moderate sample size; in fact, if your module *guarantees* an even distribution, then it is absolutely not properly random. Random data will *converge* on an even distribution as the sample size increases, but the sample size often needs to be quite large before the distribution is really very even. If you roll a ballanced die a hundred times, it is quite likely that some numbers will be significantly better represented than others. If you roll it a thousand times, the distribution will seem a bit more even, but it still won't be fully even, usually, not even enough that you can be sure about the die; you need to roll the die ten thousand or a hundred thousand times to really be sure whether the die is ballanced.

500 heads followed by 500 tails is an extreme, but a ballanced coin may very well throw heads 600 out of the first thousand times. That doesn't imply it's necessarily weighted 60% toward heads; another time the same coin tossed in the same way might throw tails 600 out of a thousand times. In fact, if it threw heads *exactly* 500 out of the first thousand times, I'd suspect it was rigged or the data falsified. 537 times out of a thousand would be *MUCH* more likely. 600 is a tad bit high, but still quite possible. If you really need to test the ballance of the coin, you need to throw it another fifty thousand times or so.

Now, if you were testing a local random number generator, this wouldn't be any big deal; let it run a million times in two and a half minutes. However, since you're getting your random data from the internet, this may be an issue. As others have suggested, you may want to examine your algorithm for bias.


$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/

In reply to Re: Testing for randomness by jonadab
in thread Testing for randomness by DrHyde

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.