in reply to How do we find statistical distribution of a given set of numbers?

Yes, there are a whole lot of tests. The simplest test of randomness is to divide the range of the generator into buckets (say 10, 100, or 1000), generate a lot of random numbers, and plot how many of them fall into each bucket. A good uniform random generator will have approximately the same number of hits in each bucket (but so will a simple counter, so this check alone proves little). In any case, the shape of the curve usually lets you make a pretty good guess about which distribution (if any) is being generated.
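As a rough illustration of the bucket idea, here is a minimal Python sketch. The helper name `bucket_counts`, the seed, the sample size, and the range [0, 1) are all assumptions chosen for the example, and Python's built-in generator stands in for whatever generator you want to inspect.

```python
import random


def bucket_counts(samples, num_buckets, low=0.0, high=1.0):
    """Count how many samples fall into each of num_buckets equal-width buckets."""
    width = (high - low) / num_buckets
    counts = [0] * num_buckets
    for x in samples:
        idx = int((x - low) / width)
        # Clamp so that x == high (or rounding error) lands in the last bucket.
        idx = min(max(idx, 0), num_buckets - 1)
        counts[idx] += 1
    return counts


# Stand-in generator: Python's Mersenne Twister with an arbitrary fixed seed.
rng = random.Random(12345)
samples = [rng.random() for _ in range(100_000)]
counts = bucket_counts(samples, 10)

# Crude text histogram: one '#' per 500 hits.
for i, c in enumerate(counts):
    print(f"bucket {i}: {c:6d} {'#' * (c // 500)}")
```

For a uniform generator the bars should come out roughly the same length; a bell shape, a ramp, or a long tail hints at some other distribution.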

A very common test of randomness is the chi-square test; a Perl implementation is available on CPAN as Statistics::ChiSquare.
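To show what the chi-square test actually computes, here is a hand-rolled Python sketch of Pearson's statistic against a uniform expectation. It is not the Statistics::ChiSquare interface; the function name, seed, and sample size are assumptions for illustration. The constant 16.919 is the standard 5% critical value for 9 degrees of freedom (10 buckets minus 1).

```python
import random


def chi_square_statistic(observed):
    """Pearson's chi-square statistic, assuming every bucket is equally likely."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)


# Tally 10,000 draws from a stand-in generator into 10 buckets.
rng = random.Random(2004)
num_buckets = 10
observed = [0] * num_buckets
for _ in range(10_000):
    observed[rng.randrange(num_buckets)] += 1

stat = chi_square_statistic(observed)

# 5% critical value for num_buckets - 1 = 9 degrees of freedom.
CRITICAL_5PCT_DF9 = 16.919
verdict = "consistent with uniform" if stat < CRITICAL_5PCT_DF9 else "suspicious"
print(f"chi-square = {stat:.3f} ({verdict} at the 5% level)")
```

A statistic far above the critical value suggests the counts are not uniform. Note that a truly random source will still exceed the 5% threshold about one run in twenty, so a single result should not be over-interpreted.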

Knuth (the legend in computer science circles) wrote extensively on the subject of testing (and writing) pseudo-random number generators in his seminal work "The Art of Computer Programming" (vol. 2: Seminumerical Algorithms). You can also find a lot of algorithms here. However, they are in FORTRAN.

Re: Re: How do we find statistical distribution of a given set of numbers?
by Sameet (Beadle) on May 02, 2004 at 13:25 UTC
    Thanks, I am downloading Statistics::ChiSquare from CPAN. Thank you for your help.
    regards
    Sameet

      Be careful!

      Generating random numbers is hard, and the chi-squared test is of very limited value. Numbers which fail it are likely not usefully random, but numbers which pass might be pretty useless as well.
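A small Python sketch of that last point, under assumed names and sizes: a plain counter taken modulo 10 gives perfectly even digit counts, so a chi-square test on single digits passes with the best possible score, yet the same test applied to non-overlapping pairs (in the spirit of a serial test) exposes it immediately.

```python
def chi_square_uniform(counts):
    """Pearson's chi-square statistic, assuming every cell is equally likely."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


k = 10
# A "generator" that is just a counter: 0, 1, 2, ..., 9, 0, 1, ...
seq = [i % k for i in range(10_000)]

# Single-digit frequencies: exactly 1000 hits per digit.
digit_counts = [0] * k
for d in seq:
    digit_counts[d] += 1

# Non-overlapping pairs (x0, x1), (x2, x3), ... tallied into a k*k grid.
pair_counts = [0] * (k * k)
for a, b in zip(seq[0::2], seq[1::2]):
    pair_counts[a * k + b] += 1

digit_stat = chi_square_uniform(digit_counts)
pair_stat = chi_square_uniform(pair_counts)
occupied = sum(1 for c in pair_counts if c)

print(f"digits: chi-square = {digit_stat:.1f}")
print(f"pairs:  chi-square = {pair_stat:.1f}, {occupied} of {k * k} cells used")
```

The digit statistic is zero, while the pair statistic is enormous because only a handful of the 100 possible pairs ever occur.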

      As with many things in life, it depends what you want. If you're rolling dice in D&D, almost any RNG will do fine. If you are simulating a time series model with high-dimensional data, only the very best RNGs will be any use at all.

      Testing randomness is not trivial. I wouldn't like to do it in Perl, though I'm sure it's possible. George Marsaglia has a useful page on his Diehard tester.

      Other resources include :-

      Good luck

      -- Anthony Staines