Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
The code below is supposed to work out how many records 25% of array @total represents and then randomly select that many records from array @sub. Is this code all right?
Seems a little weird to me. Suppose you have 8 elements, so $n is about 2 and $m is 8. The first element of @sub has a 2 in 8 chance of being selected. If it is selected, $n is now 1 and $m is 7, so the next element has a 1 in 7 chance of being selected. If it isn't selected, the next element has a 2 in 7 chance of being selected. In other words, you are going to get the right number of elements returned, but each element's chance depends on what happened to the ones before it, so it looks to me as though not all elements have an equal chance of being returned.
(Or am I being stupid?)
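One way to settle that doubt is to work out each element's overall chance exactly rather than by eye. The sketch below is only a reconstruction of the scheme as described in this reply (the original code is not shown here), and inclusion_probabilities is a hypothetical helper name. It tracks the probability of each "still needed" count while walking the list and adds up the chance that each position gets picked. If the reading of the scheme is right, every element of the 2-from-8 example should come out at the same value, since the 1 in 7 and 2 in 7 cases are weighted by how likely each one is.

```perl
use strict;
use warnings;

# Hypothetical reconstruction of the scheme described above: walk the list
# once, picking each element with probability (still needed) / (still remaining).
# Returns the overall chance of being picked for each position.
sub inclusion_probabilities {
    my ($need, $size) = @_;

    # $state{$k} = probability that $k picks are still needed at this point
    my %state = ($need => 1);
    my @prob;

    for my $i (0 .. $size - 1) {
        my $remaining = $size - $i;
        my %next;
        my $p_pick = 0;

        for my $k (keys %state) {
            my $p = $k / $remaining;    # chance of picking element $i in this state
            $p_pick += $state{$k} * $p;
            $next{$k - 1} += $state{$k} * $p       if $p > 0;
            $next{$k}     += $state{$k} * (1 - $p) if $p < 1;
        }

        push @prob, $p_pick;
        %state = %next;
    }
    return @prob;
}

my @prob = inclusion_probabilities(2, 8);
printf "element %d: %.4f\n", $_, $prob[$_] for 0 .. $#prob;
```

For the second element, for instance, the sum is 2/8 * 1/7 + 6/8 * 2/7, which is the kind of weighting the loop performs for every position.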
I suggest that your third line could be:
    while (@selected < $n) {
        push @selected, splice(@sub, int rand @sub, 1);
    }
which should ensure an even distribution, is more readable, and only iterates $n times rather than through the whole @sub array. (Copy @sub if you need to preserve the original.)
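For completeness, here is a minimal self-contained sketch of the splice approach. The contents of @total and @sub are made-up placeholders, rounding the 25% down with int is an assumption about what the asker wants, and @pool is a name introduced here for the working copy. The cap on $n is also an addition: without it the loop would never finish if $n were larger than the number of elements available.

```perl
use strict;
use warnings;

# Placeholder data, not the asker's real arrays.
my @total = (1 .. 40);
my @sub   = map { "rec$_" } 1 .. 20;

my $n = int(0.25 * @total);    # 25% of @total, rounded down (an assumption)
$n = @sub if $n > @sub;        # cannot pick more records than @sub holds

my @pool = @sub;               # work on a copy so @sub is preserved
my @selected;
while (@selected < $n) {
    # Remove one element at a random index from the pool and keep it.
    push @selected, splice(@pool, int rand @pool, 1);
}

print "@selected\n";
```

Because each pick is removed from @pool, no record can be chosen twice. Shuffling a copy with List::Util's shuffle and taking the first $n elements is another common way to get the same kind of sample.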