in reply to Rand Wackiness!

There's no new solution in this post. I just felt the thread could use some clarity. For starters, there are two problems at play here.

log2(10**12) is slightly less than 40. That means your random number generator must be capable of producing 40 bit random numbers. According to BrowserUK, Active State's generator produces 15 bit numbers.

It also means you need to be capable of receiving 40bit numbers from the random number generator. perl's floats have sufficient precision to hold a 40 bit number, so you'll have to find a random number generator that returns 40+ bit floats. Alternatively, you can increase the precison of your integers (by using Math::BigInt, for example) and using a random number generator capable of returning numbers in that format (like Zaxo's solution, for example).

Replies are listed 'Best First'.
Re^2: Rand Wackiness!
by ysth (Canon) on May 12, 2006 at 10:22 UTC
    According to BrowserUK, Active State's generator produces 15 bit numbers.
    As you can verify for yourself, with
    use Config; print $Config{randbits};

      How annoying that must be: perl on my Debian systems reports 48 bits.

      print substr("Just another Perl hacker", 0, -2);
      - apotheon
      CopyWrite Chad Perrin

        Math::Random::MT::Auto can return 64-bit random integers (on 64-bit Perls). The randbits value of 48 reflects the fact that the floating-point values returned by rand() can have at most 48 significant bits (the other bits being used for the sign and exponent).

        Remember: There's always one more bug.
Re^2: Rand Wackiness!
by Anonymous Monk on May 12, 2006 at 03:29 UTC
    I did go with Math::Random::MT and it seemed to work pretty well. Of course, I don't really have time to question it at this point. ;)

    thanks
    jason.

      Be reassured. The Mersenne Twister algorithm is very well documented and tested. And the Perl implementation is well tried and tested.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      Hmm. Something's been niggling me since my previous reply to this node.

      It has dawned on you that the MT is still only capable of producing 32-bits of output (on a 32-bit processor), and for the same reasons that scaling the standard rand to produce numbers > 32767 is flawed, scaling the output of MT to produce numbers > 2**32 is equally flawed?

      I just looked back at your OP and you say you were looking for random numbers in the range 10**12. That is roughly 2**44. Ie. > 2**32. Again, there will be a very limited range of values produced from your overall range requirements if you scale MT::rand() to that extent--2**32 / 2**44 * 100 ~= .025%.

      In order to ensure that all the numbers in your range are possible, it becomes neccessary to combine 2 calls to MT::rand() to get enough bits.

      If you generate a random integer in the range 0 .. 2**32-1 and multiply it by 2**32, then you have the potential to produce all the exact powers of 2**32 from 0 .. 2**63. If you then add a second random integer in the range 2**32-1, you have the potential to produce any integer in the range 2**64-1. You can then scale that value to produce integers in the range you require:

      ( r1 * 2**32 + r2 ) * MAX / 2**64 => 0 .. MAX

      The problem is that performing math greater than 2**53 bits in native Perl will introduce errors through IEEE floating point representation errors. You could move to using Math::BigInt, but then you pay the performance penalty.

      An alternative for numbers less than ~ 2**50, is to produce two random integers in the range 0 .. 2**(log2MAX) and combine them by the shifting & adding them as above. Then, when you scale the result, it should keep the math within the bounds of Perl's integer float precision and prevent (most) of the floating point representation errors.

      This bigRand() appears to produce results that are:

      1. within the correct range.
      2. evenly distributed within normal expectations of random numbers.

      The majority of my testing has (of necessity) been confined to using bigRand() to produce fairly small random integers (1e3 through 1e6). The run times and memory/diskspace required for adaquately testing of larger ranges has prohibited extensive testing of these. However, testing at the smaller ranges should be enough to confirm that the math is reasonable, and given the already extensive testing performed upon the MT, then if the math is correct, the results should scale appropriately.

      Please, if the randomness of the output are very important in your work, perform your own testing, to your own satisfaction, before considering this code.

      use Math::Random::MT qw[ srand rand ]; use POSIX qw[ frexp ]; sub bigRand { my $max = shift; my $exp = scalar frexp( $max ); ++$exp if $exp &1; ## Update: Didn't work if exponent +was odd. my $halfbits = $exp / 2; my @rand = map{ rand 2**$halfbits } 1 .. 2; my $rand = ( $rand[ 0 ] * 2**$halfbits ) + $rand[ 1 ]; ## Updated: Use * not << for floats! my $rv = int( $rand * ( $max / 2**$exp ) ); ## Update: Less roundo +ff }

      Does anyone have any suggestions for how to go about testing this for the higher ranges? There are essentially two problems.

      The first is simply the runtime required to produce enough random values in the range (say) 0 .. 10e12. I think it would require at least 10e14 or even 10e15 values to even begin to allow statistical verification. And it's not something that you could do as a distributed process either I think?

      The second is how to accumulate 10e12 counts?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.