I will report back chi-squared results using R
It will be interesting to see the results from a known good source.
Because I think that S::CS (Statistics::ChiSquare) is (fatally) flawed. To get some feel for the accuracy of the test it performs, I ran it on the shuffle, using the known good MT PRNG and a small dataset (1..4), a good number of times to see how consistent the results from S::CS were; and the answer is not just "not very", but actually just "not":
    #! perl -slw
    use strict;
    use Statistics::ChiSquare qw[ chisquare ];
    use Math::Random::MT;
    use Data::Dump qw[ pp ];

    my $mt = Math::Random::MT->new();

    our $N     //= 1e6;
    our $ASIZE //= 4;
    our $T     //= 4;

    sub shuffle {
        $a = $_ + $mt->rand( @_ - $_ ),
        $b = $_[$_],
        $_[$_] = $_[$a],
        $_[$a] = $b
            for 0 .. $#_;
        return @_;
    }

    my @data = ( 1 .. $ASIZE );
    my @chi;

    for( 1 .. $T ) {
        my %tests;
        ++$tests{ join '', shuffle( @data ) } for 1 .. $N;
        print chisquare( values %tests );
    }

    __END__
    C:\test>chiSquareChiSquare -ASIZE=4 -N=1e4 -T=100
    There's a >25% chance, and a <50% chance, that this data is random.
    There's a >50% chance, and a <75% chance, that this data is random.
    There's a >10% chance, and a <25% chance, that this data is random.
    There's a >10% chance, and a <25% chance, that this data is random.
    There's a >50% chance, and a <75% chance, that this data is random.
    There's a >10% chance, and a <25% chance, that this data is random.
    There's a >25% chance, and a <50% chance, that this data is random.
    There's a >75% chance, and a <90% chance, that this data is random.
    There's a >25% chance, and a <50% chance, that this data is random.
    There's a >50% chance, and a <75% chance, that this data is random.
    There's a >5% chance, and a <10% chance, that this data is random.
    There's a >75% chance, and a <90% chance, that this data is random.
    There's a >10% chance, and a <25% chance, that this data is random.
    There's a >50% chance, and a <75% chance, that this data is random.
    There's a >75% chance, and a <90% chance, that this data is random.
    There's a >75% chance, and a <90% chance, that this data is random.
    There's a >50% chance, and a <75% chance, that this data is random.
    There's a >50% chance, and a <75% chance, that this data is random.
    There's a >5% chance, and a <10% chance, that this data is random.
    There's a >95% chance, and a <99% chance, that this data is random.
    There's a >1% chance, and a <5% chance, that this data is random.
...and 79 more utterly inconsistent results.
Given that this is a known good algorithm driven by a known good PRNG, and as I said earlier, I think that is as good a definition of random as I've seen a module produce as its results.
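For anyone who wants to look at the raw numbers behind those banded messages, here is a minimal sketch that computes the Pearson chi-squared statistic by hand for the same kind of experiment. It is not what S::CS does internally, and it makes some assumptions of its own: it uses Perl's built-in rand() rather than Math::Random::MT so that it runs with core Perl only, and the names fy_shuffle, chi2_uniform, factorial and run_trial are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Fisher-Yates shuffle of a copy of the arguments, using Perl's built-in
# rand() (an assumption; the script above uses Math::Random::MT).
sub fy_shuffle {
    my @a = @_;
    for my $i ( 0 .. $#a ) {
        my $j = $i + int rand( @a - $i );    # pick from positions $i .. $#a
        @a[ $i, $j ] = @a[ $j, $i ];
    }
    return @a;
}

# Pearson chi-squared statistic against a uniform expectation.
# $cells is the number of possible outcomes, so outcomes that never
# occurred are still counted as observed zeros.
sub chi2_uniform {
    my( $cells, @observed ) = @_;
    my $total = 0;
    $total += $_ for @observed;
    my $expected = $total / $cells;
    my $chi2 = 0;
    $chi2 += ( $_ - $expected ) ** 2 / $expected for @observed;
    $chi2 += $expected * ( $cells - @observed );    # cells with zero hits
    return $chi2;
}

sub factorial {
    my $n = shift;
    my $f = 1;
    $f *= $_ for 2 .. $n;
    return $f;
}

# Shuffle 1 .. $asize $n times, count each permutation, return the statistic.
sub run_trial {
    my( $asize, $n ) = @_;
    my %count;
    ++$count{ join '', fy_shuffle( 1 .. $asize ) } for 1 .. $n;
    return chi2_uniform( factorial( $asize ), values %count );
}

unless( caller ) {
    my( $asize, $n, $trials ) = ( 4, 10_000, 10 );
    my $df = factorial( $asize ) - 1;    # 4! - 1 = 23 degrees of freedom
    printf "trial %2d: chi2 = %6.2f (df = %d)\n",
        $_, run_trial( $asize, $n ), $df
        for 1 .. $trials;
}
```

With 4 elements there are 24 possible permutations, hence 23 degrees of freedom. The mean of a chi-squared distribution equals its degrees of freedom, so the printed statistics can be compared against 23 directly instead of against the module's percentage bands.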