PerlMonks  

Re: Testing for randomness

by jonadab (Parson)
on Oct 24, 2003 at 03:40 UTC ( [id://301781] )


in reply to Testing for randomness

One major issue with testing for randomness is sample size. You seem to be implying that randomness requires an even distribution, but that is incorrect, at least for small-to-moderate samples. In fact, if your module *guarantees* an even distribution, then it is absolutely not properly random. Random data will *converge* on an even distribution as the sample size increases, but the sample often needs to be quite large before the distribution looks very even. If you roll a balanced die a hundred times, it is quite likely that some numbers will turn up noticeably more often than others. If you roll it a thousand times, the distribution will look more even, but usually still not even enough to rule out a modest bias. To be confident that the die is balanced, you need to roll it ten thousand or a hundred thousand times, depending on how small a bias you care about.
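The die example is easy to try for yourself. Here is a minimal sketch in Python, assuming the standard library's `random` module is an acceptable stand-in for a balanced die; the function names `roll_frequencies` and `spread` are made up for this illustration.

```python
import random
from collections import Counter

def roll_frequencies(n_rolls, rng):
    """Roll a simulated six-sided die n_rolls times and return the
    relative frequency of each face, in order from 1 to 6."""
    counts = Counter(rng.randint(1, 6) for _ in range(n_rolls))
    return [counts[face] / n_rolls for face in range(1, 7)]

def spread(freqs):
    """Gap between the most and least common face, as a fraction of all rolls."""
    return max(freqs) - min(freqs)

if __name__ == "__main__":
    rng = random.Random(2003)  # arbitrary seed, only so that reruns match
    for n in (100, 1_000, 10_000, 100_000):
        freqs = roll_frequencies(n, rng)
        faces = " ".join(f"{f:.3f}" for f in freqs)
        print(f"{n:>7} rolls: spread {spread(freqs):.4f}  faces {faces}")
```

A perfectly even die would show 0.167 for every face. The spread should shrink as the number of rolls grows, but only slowly: the typical deviation falls with the square root of the sample size, so ten times the precision costs a hundred times the rolls.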

500 heads followed by 500 tails is an extreme, but a balanced coin may very well throw heads 520 or 530 times out of the first thousand. That doesn't imply it's weighted toward heads; another time the same coin tossed in the same way might throw tails 530 times out of a thousand. The standard deviation for a thousand fair tosses is about 16, so roughly 95% of runs land between 469 and 531 heads. Exactly 500 is the single most likely count, but it still comes up only about 2.5% of the time, so a source that hit *exactly* half every time would make me suspect it was rigged or the data falsified. 537 is on the high side but still quite possible. 600, on the other hand, is more than six standard deviations out and would be strong evidence of a weighted coin. If you really need to test the coin for a small imbalance, you need to throw it another fifty thousand times or so.
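To put numbers on the coin example, the exact binomial probabilities for a thousand fair tosses can be computed directly. This is a small sketch in Python using only `math.comb`; the helper names `binom_pmf` and `binom_tail` are invented for the illustration.

```python
from math import comb, sqrt

def binom_pmf(k, n):
    """P(exactly k heads in n tosses of a fair coin), from exact integers."""
    return comb(n, k) / 2**n

def binom_tail(k, n):
    """P(at least k heads in n tosses of a fair coin)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2**n

if __name__ == "__main__":
    n = 1000
    sigma = sqrt(n) / 2  # standard deviation of the head count for a fair coin
    print(f"standard deviation: {sigma:.1f} heads")
    for k in (500, 537, 600):
        print(f"P(exactly {k} heads)  = {binom_pmf(k, n):.3g}")
        print(f"P(at least {k} heads) = {binom_tail(k, n):.3g}")
```

Comparing the "at least" figures is the fairer test, since any single exact count is individually unlikely.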

Now, if you were testing a local random number generator, this wouldn't be any big deal; let it run a million times in two and a half minutes. However, since you're getting your random data from the internet, collecting that many samples may be an issue. As others have suggested, you may want to examine your algorithm for bias.
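When samples are expensive to collect, one common way to judge what a limited sample can tell you is Pearson's chi-square goodness-of-fit test. It is a standard textbook technique rather than anything proposed in this thread, and the sketch below (Python, with the made-up names `chi_square_stat` and `looks_biased`) assumes six equally likely outcomes, as with a die.

```python
# 5% critical value of the chi-square distribution with 5 degrees of
# freedom (six categories minus one), taken from standard tables.
CRITICAL_5DF_5PCT = 11.07

def chi_square_stat(counts):
    """Pearson's chi-square statistic against a uniform expectation."""
    total = sum(counts)
    expected = total / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def looks_biased(counts):
    """True if six observed counts deviate from uniform by more than
    chance alone would explain at the 5% level. Valid only for exactly
    six categories, because the critical value is hard-coded."""
    if len(counts) != 6:
        raise ValueError("this sketch only handles six categories")
    return chi_square_stat(counts) > CRITICAL_5DF_5PCT

if __name__ == "__main__":
    small = [20, 15, 18, 14, 17, 16]   # 100 rolls
    large = [c * 100 for c in small]   # same proportions, 10,000 rolls
    for counts in (small, large):
        print(sum(counts), "rolls:", round(chi_square_stat(counts), 2),
              "biased?", looks_biased(counts))
```

The two sample sets have identical proportions, yet only the larger one gives grounds to call the die biased. A hundred rolls simply cannot distinguish that much unevenness from ordinary luck.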


$;=sub{$/};@;=map{my($a,$b)=($_,$;);$;=sub{$a.$b->()}} split//,".rekcah lreP rehtona tsuJ";$\=$ ;->();print$/
