in reply to Sample Probabilities
Continuing with that example, the probability of getting green, red, red, in that order, is:

```
(41/92)*(35/91)*(34/90)
```

All we have to do now is multiply this quantity by the number of ways to rearrange (green, red, red). In this case, there are 3 ways to rearrange them.

In general, the probability of getting x1 red, x2 green, x3 blue, etc., in some particular order, is the following (I separated out the numerator and denominator to make it easier to read; the denominator has one factor per item drawn):

```
(#red * (#red-1) * ... * (#red-x1+1)) * (#green * (#green-1) * ... * (#green-x2+1)) * ...
/ ( total * (total-1) * (total-2) * ... )
```

And the number of ways of reordering a list of x1 reds, x2 greens, x3 blues, etc., is:

```
(x1+x2+x3+ ...)! / (x1! * x2! * x3! * ... )
```

Those exclamation points are for factorials.

As Perl code, and using the example in your original post (I changed the indexing to be 0-based):

```perl
use List::Util 'sum';

sub fact {
    my $n = shift;
    my $f = 1;
    $f *= $_ for 1 .. $n;
    $f;
}

## $n * ($n-1) * ... * ($n-$m+1)
sub lower {
    my ($n, $m) = @_;
    my $result = 1;
    $result *= $n-$_ for 0 .. $m-1;
    $result;
}

sub prob {
    my ($sizes, $sample) = @_;

    my $n = @$sizes;       ## how many types of items
    my $S = sum @$sizes;   ## total number of items in the universe

    my @sampled    = (0) x $n;
    my $samplesize = @$sample;
    $sampled[$_]++ for @$sample;   ## $sampled[x] = how many of type x
                                   ## are in this sample?

    my $total = fact($samplesize) / lower($S, $samplesize);
    $total *= lower($sizes->[$_], $sampled[$_]) / fact($sampled[$_])
        for 0 .. $n-1;
    $total;
}

## probability of getting 2 of type 8, 1 of type 3, etc.
print prob(
    [35000,41000,16000,18000,21000,45000,27000,10000,16000],
    [8,8,3,1,5,0,0]
);

## output: 0.000397344651946416
```

Disclaimer: This code has not been tested in any great detail. Also keep in mind that all the probabilities will look small, because there are so many different events in this distribution.
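As a quick sanity check on the small green/red/red example, here is a minimal sketch that computes each of the three orderings separately and compares their sum with "one ordering times 3". The counts (41 green, 35 red, 92 items in total) are read off the fractions in the worked example; the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Counts read off the fractions (41/92)*(35/91)*(34/90):
# 41 green, 35 red, 92 items in total (an assumption from the worked example).
my ($green, $red, $total) = (41, 35, 92);

# Probability of each specific ordering of one green and two reds,
# drawing without replacement.
my $grr = ($green/$total) * ($red/($total-1))     * (($red-1)/($total-2));
my $rgr = ($red/$total)   * ($green/($total-1))   * (($red-1)/($total-2));
my $rrg = ($red/$total)   * (($red-1)/($total-1)) * ($green/($total-2));

# The three orderings share the same numerator and denominator factors,
# so adding them up is the same as multiplying one of them by 3.
my $any_order = 3 * $grr;

printf "one ordering:     %.6f\n", $grr;
printf "3 * one ordering: %.6f\n", $any_order;
printf "sum of all three: %.6f\n", $grr + $rgr + $rrg;
```

The last two lines should print the same value, which is the point of multiplying by the number of rearrangements instead of enumerating them.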
Update: updated computation to account for drawing without replacement. Old code is still in comment tags. The probabilities hardly change at all since the domain is so large.
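To see how little the answer moves, here is a rough, self-contained sketch that computes the same example both ways: once treating every draw as independent (with replacement) and once shrinking the pool after each draw (without replacement). It is not the old code from the comment tags, just an illustration; the helper `fact` and the variable names are my own.

```perl
use strict;
use warnings;
use List::Util 'sum';

# Group sizes and sample from the example (0-based type indices).
my @sizes  = (35000,41000,16000,18000,21000,45000,27000,10000,16000);
my @sample = (8,8,3,1,5,0,0);
my $S      = sum @sizes;

sub fact { my $n = shift; my $f = 1; $f *= $_ for 1 .. $n; return $f }

# Probability of drawing the sample in exactly the order given.
my ($with, $without) = (1, 1);
my @left  = @sizes;   # items of each type still in the pool
my $drawn = 0;        # how many items have been removed so far
for my $type (@sample) {
    $with    *= $sizes[$type] / $S;              # pool never shrinks
    $without *= $left[$type]  / ($S - $drawn);   # pool shrinks each draw
    $left[$type]--;
    $drawn++;
}

# Number of distinct orderings: (x1+x2+...)! / (x1! * x2! * ...)
my %count;
$count{$_}++ for @sample;
my $orderings = fact(scalar @sample);
$orderings /= fact($_) for values %count;

$with    *= $orderings;
$without *= $orderings;

printf "with replacement:    %.12g\n", $with;
printf "without replacement: %.12g\n", $without;
printf "relative difference: %.3g\n", abs($with - $without) / $without;
```

With a universe of 229000 items and a sample of only 7, removing an item barely changes any of the fractions, so the two results agree to several significant digits.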
Update: fixed copy & paste error, thanks to BrowserUk++.
blokhead
Re^2: Sample Probabilities
by BrowserUk (Patriarch) on Dec 08, 2006 at 07:20 UTC
Re^2: Sample Probabilities
by willyyam (Priest) on Dec 07, 2006 at 19:11 UTC