Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

The code below is supposed to work out how many records make up 25% of the array @total, and then randomly select that many records from the array @sub. Is this code all right?

$n = @total * .25;
$m = @sub;
@select = grep rand($m--)<$n?$n--:0, @sub;


I tested this on my side and it 'seems' to work fine. However, I want to be sure that I'm doing the right thing here.

Thanks,
Ralph.

Re: Selecting array records - code question
by dash2 (Hermit) on Jun 10, 2003 at 00:17 UTC
    This seems a little odd to me. Suppose @sub has 8 elements and $n works out to 2, so $m is 8. The first element of @sub has a 2 in 8 chance of being selected. If it is selected, $n is now 1 and $m is 7, so the next element has a 1 in 7 chance of being selected. If it isn't selected, the next element has a 2 in 7 chance. In other words, you will get the right number of elements back, but it looks as though not all elements have an equal chance of being returned.

    (Or am I being stupid?)
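
    One way to check this worry is to work out the exact chance that each position in @sub is kept by the one-pass grep, rather than reasoning about a single branch. The sketch below does that by tracking how many picks are still needed after each element. The names inclusion_probabilities, $need and $size are illustrative and not from the original post, and it assumes $n is a whole number (@total * .25 can be fractional, in which case the grep may return one element more than expected).

```perl
use strict;
use warnings;

# Exact inclusion probabilities for the one-pass grep in the question, where
# each element is kept with probability (picks still needed) / (elements left).
# $need and $size are illustrative names, not from the original post.
sub inclusion_probabilities {
    my ($need, $size) = @_;
    my @dist = (0) x ($need + 1);    # $dist[$k] = P($k picks are still needed)
    $dist[$need] = 1;
    my @prob;
    for my $seen (0 .. $size - 1) {
        my $left   = $size - $seen;  # elements not yet examined
        my @next   = (0) x ($need + 1);
        my $p_take = 0;
        for my $k (0 .. $need) {
            next unless $dist[$k];
            # rand($left) < $k is true with probability $k / $left, capped at 1
            my $take = $k >= $left ? 1 : $k / $left;
            $p_take       += $dist[$k] * $take;
            $next[$k - 1] += $dist[$k] * $take if $k > 0;
            $next[$k]     += $dist[$k] * (1 - $take);
        }
        push @prob, $p_take;
        @dist = @next;
    }
    return @prob;
}

# The 8-element, pick-2 case discussed above.
printf "%.4f\n", $_ for inclusion_probabilities(2, 8);
```

    For the 8-element case this prints 0.2500 for every position: the second element's chance is (2/8)(1/7) + (6/8)(2/7) = 1/4, the same as the first. So the per-element chance appears to be even after all, at least when $n is an integer; the technique is usually described as selection sampling.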

    I suggest that your third line could be:

    while (@selected < $n) { push @selected, splice(@sub, int rand @sub, 1); }

    which should ensure an even distribution, is more readable, and only iterates $n times rather than through the whole @sub array. (Copy @sub if you need to preserve the original.)
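
    A minimal sketch of the splice approach wrapped in a sub, assuming you want to leave @sub untouched and round a fractional 25% down. The name pick_random and the sample data are made up for illustration. It also caps the count at the size of the array, since the bare while loop would never finish if $n were larger than the number of elements available.

```perl
use strict;
use warnings;

# Pick $count elements at random from @$source without altering the caller's
# array. pick_random is an illustrative name, not part of the original post.
sub pick_random {
    my ($source, $count) = @_;
    my @pool = @$source;                 # work on a copy
    $count = int $count;                 # e.g. 2.5 becomes 2
    $count = @pool if $count > @pool;    # cannot pick more than exist
    my @selected;
    while (@selected < $count) {
        # splice removes one random element, so it cannot be picked twice
        push @selected, splice(@pool, int rand @pool, 1);
    }
    return @selected;
}

# Hypothetical sample data standing in for the poster's arrays.
my @total    = (1 .. 10);
my @sub      = qw(a b c d e f g h);
my @selected = pick_random(\@sub, @total * .25);
print "@selected\n";
```

    Here @total * .25 is 2.5, so two elements of @sub are printed and @sub itself still holds all eight.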

    andramoiennepemousapolutropon