http://qs1969.pair.com?node_id=388533

johnnywang has asked for the wisdom of the Perl Monks concerning the following question:

This must be well-known, but I can't seem to find a good answer: given a range and size, what's the best way to find a set of distinct random numbers? i.e.
sub distinct_random_int{ my($num, $start,$end) = @_; #return an arry of size $num }
Updated: changed "different" to "distinct" after Aristotle's remark.

Replies are listed 'Best First'.
Re: How to generate different random numbers?
by BrowserUk (Patriarch) on Sep 04, 2004 at 23:18 UTC

    This should probably have some error checking to make sure that $start is less than $end, and that $end-$start >= $num. If your intent is that $end shoudl be inclusive it will need small tweak also.

    #! perl -slw use strict; sub distinct_random_int{ my($num, $start,$end) = @_; my %uniq; $uniq{ $start + int( rand $end-$start ) } = undef while keys %uniq < $num; return keys %uniq; } print for distinct_random_int 10, 20, 30; __END__ P:\test>388533 27 25 28 21 26 20 22 24 23 29

    That works, but if $num is close to being the same size as the range, then you would be better to shuffle the range and then select the first $num elements:

    #! perl -slw use strict; use List::Util qw[ shuffle ]; sub distinct_random_int{ my($num, $start,$end) = @_; return ( shuffle $start .. $end )[ 1 .. $num ]; } print for distinct_random_int 10, 20, 30;

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon
Re: How to generate different random numbers?
by Zaxo (Archbishop) on Sep 04, 2004 at 23:20 UTC

    You reduce randomness slightly my insisting on no duplication. Here's the code:

    sub distinct_random_int { my($num, $start,$end) = @_; my ($range, %nums) = $end - $start; if ($num < $range) { $nums{$start + int rand $range} = undef while keys(%nums) < $num; return keys %nums; } else { return; } }
    Storing in hash keys gets us uniqueness.

    After Compline,
    Zaxo

      No he doesn't. What he's actually asking for is a random shuffle. There are well known O(n) time and memory algorithms for that sort of thing. Look up Fischer-Yates.

      update: of course a shuffle won't work so well if the range is large

        No he doesn't.
        I'm going to go ahead and disagree with you there, cowboy. Any time you impose a restriction on random, you've made it not random. A random shuffle of a known interval isn't a set of random numbers. Woe unto he who uses the "random" numbers for cryptographic purposes.

        thor

        Feel the white light, the light within
        Be your own disciple, fan the sparks of will
        For all of us waiting, your kingdom will come

      It might be more helpful to return shuffled ints between $start and $end, even if $num < $range. The user just won't get back $num ints. Perhaps the $range+1 returned value could be "Range too small."

      As for shuffle vs. rand, "use Benchmark". :-) See if there's a predictable threshold where one overtakes the other.

      Update: Is the ratio between $num and $range significant when determining whether a shuffle or rand is better to use? It would seems that a very small $num:$range ratio would benefit from rand, where shuffle-ing is better as $num:$range -> 1.

      Or am I speaking from a non-standard orifice?

        hi johnnywang,
        As soon as you talk about 'distinct', 'random' goes out of the window. So i assume you need distinct numbers upon which you can't apply any pattern. The best way is to write a hash function on the current date + time which ofcourse must not give always increasing or decreasing numbers. Use google to find a suitable algorithm ( i am sure people have done it before).
Re: How to generate different random numbers?
by Aristotle (Chancellor) on Sep 04, 2004 at 23:13 UTC

    Different how? That problem description is incomplete.

    We can't read your mind.

    Makeshifts last the longest.

      I meant: distinct, i.e, the returned list has no repeats.

        Ah, your function name hints at that.

        What's the "range" about? A range in which the random numbers should fall?

        Makeshifts last the longest.

Re: How to generate distinct random numbers?
by Aristotle (Chancellor) on Sep 05, 2004 at 12:42 UTC

    Just for the record, since everyone's talking about it but noone has shown any code:

    use List::Util qw( shuffle ); sub distinct_random_int { my ( $num, $start, $end ) = @_; return if $start > $end or $num > $end - $start + 1; ( shuffle $start .. $end )[ 0 .. $num - 1 ]; }

    Update: indeed, so long as the size of the range and the number of elements requested is within an order of magnitude, this is faster than the hash&retry solutions, in my tests by about 20%. If the discrepancy grows any larger, the hash&retry approach becomes much cheaper very quickly.

    use strict; use warnings; use Benchmark qw( cmpthese ); use List::Util qw( shuffle ); sub randseq_shuffle { my ( $num, $start, $end ) = @_; return if $start > $end or $num > $end - $start + 1; ( shuffle $start .. $end )[ 0 .. $num - 1 ]; } sub randseq_retry { my ( $num, $start, $end ) = @_; return if $start > $end or $num > $end - $start + 1; my %seen; my $range = $end - $start; my $base = $start; undef $seen{ $base + int rand $range } until $num == keys %seen; keys %seen; } for ( [ 5, 10, 30 ], [ 50, 100, 300 ], [ 500, 1000, 3000 ], [ 5000, 10000, 30000 ], [ 50000, 100000, 300000 ], [ 5000, 10000, 30000 ], [ 500, 10000, 30000 ], [ 50, 10000, 30000 ], [ 5, 10000, 30000 ], ) { my ( $n, $s, $e ) = @$_; cmpthese -1 => { "shuffle($n,$s,$e)" => sub { randseq_shuffle $n, $s, $e }, "retry($n,$s,$e)" => sub { randseq_retry $n, $s, $e }, }; }

    Makeshifts last the longest.

Re: How to generate distinct random numbers?
by johnnywang (Priest) on Sep 05, 2004 at 06:43 UTC
    Thank you all, actually I was asking for the simple question of getting some distinct random numbers, I guess the range throw you guys off, into shuffles. I somehow didn't feel good about: "keep trying until one gets enough distinct ones", but it is enough for my current purpose. Thanks again.
Re: How to generate distinct random numbers?
by ysth (Canon) on Sep 06, 2004 at 05:23 UTC
    This may or may not be appropriate; you haven't really given enough information about how this would be used. (Are you looking for integers or floating point? Is the range going to be in the hundreds or billions? etc.)
    sub distinct_random_int { my ($num, $start, $end) = @_; my @rand; my @avail = $start..$end; push @rand, splice(@avail, rand(@avail), 1) for 1..$num; @rand; }

      That is not a fair algorithm. @avail shrinks during your loop, so the range rand(@avail) picks from does as well, and thus chances for any one number to be picked increase with each iteration.

      Makeshifts last the longest.

        No.