carlos fandango has asked for the wisdom of the Perl Monks concerning the following question:

This should be really simple, but it's driving me nuts. The code below should calculate how many errors you would get per million iterations at a given rate of error. It does the calculation 5 times, just to demonstrate the distribution of errors.
for ($x=10; $x<1000000; $x=$x*10){
    $error_rate = $x;
    print "Error rate = 1 every $x, errors = ";
    for ($y=0;$y<5;$y++){
        errortest();
    }
    print "\n";
}

sub errortest {
    $errors = 0;
    for ($n=0;$n<1000000;$n++){
        $random = int (rand($error_rate));
        if ($random == 1){$errors++}
    }
    print $errors, " ";
}

Output:

Error rate = 1 every 10, errors = 99778 100371 99586 99912 99778
Error rate = 1 every 100, errors = 10007 9962 9803 10063 10132
Error rate = 1 every 1000, errors = 1004 1019 1049 1011 945
Error rate = 1 every 10000, errors = 90 84 87 79 83
Error rate = 1 every 100000, errors = 0 0 0 0 0

My question: why aren't the values correct? You'd expect 100 errors at a 1 in 10000 error rate, not the 84 (average) above. At a 1 in 100000 rate you'd expect 10, not 0. What's going on here - I'm going crazy. This should be so simple ...

Replies are listed 'Best First'.
Re: Math all gone wrong...
by Zaxo (Archbishop) on Dec 01, 2002 at 01:49 UTC

    If you're only expecting 100 errors, the standard deviation will be about 10. Your results are indeed low for the expected rates in the low-rate calculations. How good is your random number generator?

    I got:

    Error rate = 1 every 10, errors = 100046 99765 100442 99969 99433 
    Error rate = 1 every 100, errors = 10134 9873 10067 10072 9842 
    Error rate = 1 every 1000, errors = 1012 1056 1015 986 1055 
    Error rate = 1 every 10000, errors = 103 101 105 93 96 
    Error rate = 1 every 100000, errors = 8 14 10 14 11
    
    without any modification of your code.

    Update: Athlon/Linux 2.4.17/Perl 5.6.1 for this run. /dev/random is available.

    After Compline,
    Zaxo
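
The spread mentioned above can be checked with a little arithmetic. This is only a sketch: it treats each of the million iterations as an independent draw with error probability 1/N (a binomial model), and binomial_stats is a made-up helper name, not something from the original code.

```perl
use strict;
use warnings;

# Expected count and standard deviation for a binomial process:
# $iterations independent draws, each an error with probability 1/$one_in.
# (binomial_stats is a hypothetical helper, introduced for illustration.)
sub binomial_stats {
    my ($iterations, $one_in) = @_;
    my $p    = 1 / $one_in;
    my $mean = $iterations * $p;
    my $sd   = sqrt($iterations * $p * (1 - $p));
    return ($mean, $sd);
}

for my $one_in (10, 100, 1_000, 10_000, 100_000) {
    my ($mean, $sd) = binomial_stats(1_000_000, $one_in);
    printf "1 every %6d: expect %8.1f errors, standard deviation %6.1f\n",
        $one_in, $mean, $sd;
}
```

For the 1 in 10000 case this gives a mean of 100 with a standard deviation just under 10, so single runs in the 90s or 110s are unremarkable, while five runs in a row of exactly 0 at 1 in 100000 are not.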

      My results:
      Error rate = 1 every 10, errors = 100098 100545 99931 99905 99434 
      Error rate = 1 every 100, errors = 10037 10075 9872 9924 10133 
      Error rate = 1 every 1000, errors = 959 978 1012 1032 1041 
      Error rate = 1 every 10000, errors = 97 100 109 92 86 
      Error rate = 1 every 100000, errors = 11 11 9 12 7
      

      What are you running this on (Machine/OS/Perl Version)? Also, rule of thumb is 30 trials for reliable statistical results. 100 is a safer bet.

      Update: (PIII/2.4.18/5.6.1 w/ /dev/random) Thanks Zaxo, although to clarify, I was most interested in what the original poster was running on.

      On an entirely different track, if you need real random numbers (as opposed to calculated pseudo-random numbers) have a look at

      http://www.random.org/essay.html
      or
      http://www.fourmilab.ch/hotbits/

      CountZero

      "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: Math all gone wrong...
by pg (Canon) on Dec 01, 2002 at 01:34 UTC
    Some inside knowledge of how ANSI C generates random numbers on a 32-bit machine:

    this_seed = (last_seed * 69069) % 2 ** 32; (equation 1)

    this_random_number = this_seed / 2 ** 32; (equation 2)

    Every time you call rand, it first generates a new seed according to equation 1, then generates the random number following equation 2.

    You can determine the initial seed by calling srand.

    This is just how ANSI C does it. Obviously there are other ways.

    Want more? Try the following set of equations, which belongs to the same "family" as the set above (that is the term we use to group random-number algorithms; as you can see, this family is linear):

    this_seed = (last_seed * 65539) % 2 ** 31; (equation 1)

    this_random_number = this_seed / 2 ** 31; (equation 2)
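
For anyone who wants to play with the two sets of equations above, here is a minimal Perl sketch of them. The helper name make_lcg and the starting seed of 1 are my own choices for illustration, and the code assumes a Perl whose numbers can hold the products exactly (true for ordinary double-precision builds at these sizes).

```perl
use strict;
use warnings;

# Build a generator closure from a multiplier, a modulus and a starting seed.
# make_lcg is a hypothetical helper name, not part of Perl or any C library.
sub make_lcg {
    my ($multiplier, $modulus, $seed) = @_;
    my $state = $seed;
    return sub {
        $state = ($state * $multiplier) % $modulus;   # equation 1
        return $state / $modulus;                     # equation 2
    };
}

# The two parameter sets quoted above; the seed of 1 is arbitrary.
my $first  = make_lcg(69069, 2**32, 1);
my $second = make_lcg(65539, 2**31, 1);

printf "%.10f  %.10f\n", $first->(), $second->() for 1 .. 3;
```

Each call advances the stored state and returns a value in the range 0 to just under 1, so multiplying the result by N gives a drop-in stand-in for rand(N) in the test script.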

Re: Math all gone wrong...
by thewalledcity (Friar) on Dec 01, 2002 at 01:44 UTC
    You are not doing enough trials. If you did 10000 trials you would see that the average number of errors approaches 100.
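
One way to judge whether five runs are enough is to ask how far the observed average sits from the expected 100, measured in standard errors. The sketch below does that for the 1 in 10000 figures posted in this thread. It assumes a binomial model for each run, and z_score_of_mean is a hypothetical helper name.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# How many standard errors the average of several runs lies from the
# expected count, assuming each run is binomial with probability 1/$one_in.
# (z_score_of_mean is a hypothetical helper, introduced for illustration.)
sub z_score_of_mean {
    my ($counts, $iterations, $one_in) = @_;
    my $p        = 1 / $one_in;
    my $expected = $iterations * $p;
    my $sd       = sqrt($iterations * $p * (1 - $p));
    my $mean     = sum(@$counts) / scalar(@$counts);
    my $std_err  = $sd / sqrt(scalar @$counts);
    return ($mean - $expected) / $std_err;
}

# Counts copied from the outputs posted in this thread (1 in 10000 row).
printf "Original poster: z = %.2f\n",
    z_score_of_mean([90, 84, 87, 79, 83], 1_000_000, 10_000);
printf "Zaxo:            z = %.2f\n",
    z_score_of_mean([103, 101, 105, 93, 96], 1_000_000, 10_000);
```

A value near zero is consistent with chance; a value several standard errors away suggests something other than a small sample is at work.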
Re: Math all gone wrong...
by ibanix (Hermit) on Dec 01, 2002 at 03:42 UTC
    Any bets he's using ActivePerl? Here's the output I get under ActivePerl 633 (5.6.1.633) on Win2k:

    Error rate = 1 every 10, errors = 100246 100029 99946 99592 100069
    Error rate = 1 every 100, errors = 9935 10090 10073 9879 10043
    Error rate = 1 every 1000, errors = 998 973 966 967 1003
    Error rate = 1 every 10000, errors = 96 96 75 93 97
    Error rate = 1 every 100000, errors = 0 0 0 0 0



    <-> In general, we find that those who disparage a given operating system, language, or philosophy have never had to use it in practice. <->
      Hmm - thanks all - you've hit the nail on the head. I'm running this on a Win98 machine using ActivePerl. The prizes go to all those who bet that way. Clearly ActivePerl has a screwy rand function, since the code works fine on the others (many many thanks to those above who tried it on another system and showed me that I'm not going insane). Is there a way to fix this in ActivePerl, or am I doomed to reduced probabilities?
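
A possible explanation for the exact zeros, offered as an assumption rather than something established in this thread: if the platform's rand() can only return multiples of 1/2**15 (a resolution often reported for Windows builds of Perl from that era), then only a handful of its 32768 possible outputs can land in the narrow slot where int(rand($rate)) == 1. The sketch below simply counts them; count_hits and the 15-bit figure are illustrative, not taken from ActivePerl's source.

```perl
use strict;
use warnings;

# Count how many of the 2**$bits possible rand() outputs make
# int(rand($rate)) equal 1, assuming rand() returns k / 2**$bits.
# (count_hits is a hypothetical helper, introduced for illustration.)
sub count_hits {
    my ($rate, $bits) = @_;
    my $steps = 2**$bits;
    my $hits  = 0;
    for my $k (0 .. $steps - 1) {
        $hits++ if int($rate * $k / $steps) == 1;
    }
    return $hits;
}

my $bits = 15;   # ASSUMPTION: resolution of rand() on the affected platform
for my $rate (10, 100, 1_000, 10_000, 100_000) {
    my $hits = count_hits($rate, $bits);
    printf "1 every %6d: %4d of %d values hit, about %.1f errors per million\n",
        $rate, $hits, 2**$bits, 1_000_000 * $hits / 2**$bits;
}
```

Under that assumption, only 3 of the 32768 values hit at 1 in 10000 (about 91.6 errors per million instead of 100), and none at all hit at 1 in 100000, which would line up with the rows of zeros above. The resolution a given build was compiled with can be read from the randbits entry of the Config module.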
        Now fixed (see modified code and new output below) on ActivePerl, thanks to the previous replies. Has this been a common problem in ActivePerl? I'm pretty new to this, and naively thought the rand function would be platform independent.
        $seed = rand(100);

        for ($x=10; $x<10000000; $x=$x*10){
            $error_rate = $x;
            print "Error rate = 1 every $x, errors = ";
            for ($y=0;$y<5;$y++){
                errortest();
            }
            print "\n";
        }

        sub errortest {
            $errors = 0;
            for ($n=0;$n<1000000;$n++){
                $random = int (better_random($error_rate));
                if ($random == 1){$errors++}
            }
            print $errors, " ";
        }

        sub better_random {
            $seed = ($seed * 65539) % 2 ** 31;
            $this_random = $_[0] * ($seed / 2 ** 31);
            return $this_random;
        }

        Output:

        Error rate = 1 every 10, errors = 99259 100049 99986 99998 99818
        Error rate = 1 every 100, errors = 10166 10098 10056 10107 10032
        Error rate = 1 every 1000, errors = 967 1013 1009 948 1004
        Error rate = 1 every 10000, errors = 96 108 112 116 104
        Error rate = 1 every 100000, errors = 16 11 7 8 10
        Error rate = 1 every 1000000, errors = 0 3 1 1 0