... there's also the quite real possiblity of 4 never being returned from rand!
In that case, the pseudo random number generator would be broken (taking 'never' literally).
In theory, it might occasionally take a 'long time' for any certain number
to occur, but that's unlikely in practice.
Actually, presuming a statistically equal distribution of values
returned by the PRNG, the probability to get a certain (integer) number is expected to be 1/N (N being
32 in this case). So, even on the last turn, when all but one of the
32 values are already taken, the chance to get the final remaining
number is 1/32 (or 32 attempts, on average); when there are still two
numbers left, the chance is 2/32 (16 attempts), etc.
Doing some simple simulation statistics confirms this quite well
(although slightly modified in implementation, the central algorithm is
functionally the same as that of the OP):
#!/usr/bin/perl
use strict;
use warnings;
my $n = 1e4; # number of runs for the statistics
my $N = 32;
my @sum;
my @max = (0)x($N+1);
for (1..$n) {
my @have_num = ();
for my $i (1..$N) {
my $attempts = 0;
while(1) {
my $rand = int(rand($N));
#print ".";
$attempts++;
unless ($have_num[$rand]) {
#print " $i: $rand\n";
$have_num[$rand] = 1;
last;
}
}
$sum[$i] += $attempts;
$max[$i] = $attempts if $attempts > $max[$i];
}
}
for my $i (1..$N) {
my $avg = $sum[$i]/$n;
my $bar = "*" x int($avg+0.5);
printf "%2d: %6.3f [%6.3f] %s max=%d\n",
$i, $avg, $N/($N-$i+1), $bar, $max[$i];
}
__END__
1: 1.000 [ 1.000] * max=1
2: 1.033 [ 1.032] * max=4
3: 1.065 [ 1.067] * max=4
4: 1.104 [ 1.103] * max=5
5: 1.142 [ 1.143] * max=7
6: 1.186 [ 1.185] * max=7
7: 1.230 [ 1.231] * max=8
8: 1.283 [ 1.280] * max=8
9: 1.334 [ 1.333] * max=8
10: 1.392 [ 1.391] * max=10
11: 1.453 [ 1.455] * max=10
12: 1.521 [ 1.524] ** max=11
13: 1.601 [ 1.600] ** max=13
14: 1.684 [ 1.684] ** max=13
15: 1.779 [ 1.778] ** max=14
16: 1.882 [ 1.882] ** max=16
17: 1.999 [ 2.000] ** max=19
18: 2.137 [ 2.133] ** max=17
19: 2.283 [ 2.286] ** max=23
20: 2.462 [ 2.462] ** max=20
21: 2.674 [ 2.667] *** max=26
22: 2.904 [ 2.909] *** max=29
23: 3.198 [ 3.200] *** max=31
24: 3.551 [ 3.556] **** max=37
25: 3.996 [ 4.000] **** max=48
26: 4.554 [ 4.571] ***** max=47
27: 5.331 [ 5.333] ***** max=55
28: 6.400 [ 6.400] ****** max=79
29: 8.026 [ 8.000] ******** max=95
30: 10.597 [10.667] *********** max=140
31: 15.983 [16.000] **************** max=211
32: 32.092 [32.000] ******************************** max=343
The second column is the number of attempts, averaged over 10000 runs
in this case (a single run doing what the OP's code did). The values in brackets are the corresponding expected values.
"max=" is the worst case number of attempts that
ever occurred.
|