in reply to Selecting random records from an array

How about this? It seemed simple and quick, though perhaps not the most robust or efficient. Any comments are welcome.

@foo = qw( a b c d e f g h i j k l m n o p q r s t u v w y z );
my @feh = sort { rand(1) >= .5 } @foo;    # randomly sort
my $upper = int( scalar(@feh) * .75 );    # upper index to get 75% of records
print join (",", @feh[ 0 .. $upper]);

#Example output:
#C:\>perl randsort.pl
#q,r,p,j,o,d,a,b,c,g,e,h,f,i,l,k,m,n,s
#C:\>perl randsort.pl
#t,g,r,s,q,o,p,b,h,e,f,a,c,d,m,n,l,k,i

Re: Re: Selecting random records from an array
by tall_man (Parson) on Jun 04, 2003 at 22:09 UTC
    That's not a valid sort routine, as "perldoc -f sort" says:

    The comparison function is required to behave. If it returns inconsistent results (sometimes saying $x[1] is less than $x[2] and sometimes saying the opposite, for example) the results are not well-defined.

      ... the results are not well-defined.

      Isn't that exactly why it works (for some definition of the term :)? Isn't the whole point of making a random selection to achieve a "not well-defined" result?

      That said, this 'desort' method of shuffling an array doesn't stand up to analysis for another reason. Statistically, the results are extremely biased as I showed here. Note the abysmal standard deviation of this method (labelled 'qsort', as it was done under 5.6).
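      For comparison, here is a minimal sketch of one commonly used unbiased alternative, the Fisher-Yates shuffle. It is not taken from this thread; the sub name fisher_yates and the variable $count are made up for the example. List::Util's shuffle should serve the same purpose where that module is available.

```perl
use strict;
use warnings;

# In-place Fisher-Yates shuffle. The name fisher_yates is hypothetical,
# chosen only for this sketch.
sub fisher_yates {
    my ($array) = @_;
    for (my $i = $#$array; $i > 0; $i--) {
        my $j = int rand($i + 1);              # pick 0 <= $j <= $i
        @$array[$i, $j] = @$array[$j, $i];     # swap the two slots
    }
    return $array;
}

my @foo = qw( a b c d e f g h i j k l m n o p q r s t u v w y z );
my @feh = @{ fisher_yates([@foo]) };           # shuffle a copy, leave @foo alone

my $count = int(@feh * .75);                   # number of records to keep
print join(",", @feh[0 .. $count - 1]), "\n";
```

      Note that the slice stops at $count - 1, so it yields exactly $count records rather than one extra.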

      Might be interesting to see what sort[sic] of results you would get using the default mergesort in 5.8.
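      A rough way to try that experiment, as a sketch only: tally how often each item lands in the first position over many runs. The item list, the trial count and the %first hash are assumptions made for the example, and the numbers will depend on the perl version and its sort implementation.

```perl
use strict;
use warnings;

my @items  = ('a' .. 'e');
my $trials = 10_000;
my %first;    # how often each item ends up in position 0

for (1 .. $trials) {
    # The comparator under discussion: it answers at random on every call.
    my @shuffled = sort { rand(1) >= .5 } @items;
    $first{ $shuffled[0] }++;
}

# An unbiased shuffle would put each item first about $trials / @items times.
printf "%s: %d (expected about %d)\n", $_, $first{$_} || 0, $trials / @items
    for @items;
```

      Large deviations from the expected count for some items would point to the bias described above.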


      Examine what is said, not who speaks.
      "Efficiency is intelligent laziness." -David Dunham
      "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller


      You bring up a great point. It's a new day, and giving it another look I'd imagine that using rand() within a sort algorithm might be bad -- perhaps causing infinite looping because the comparison between two items keeps changing. Apparently it doesn't, but perhaps the default sort might with a larger set, or its behavior might change in the future. How's this?

      I guess I should only do the rand once and memoize the result.

      my @foo = qw( a b c d e f g h i j k l m n o p q r s t u v w y z );
      my %comp;
      my @feh = sort {
          if (! defined $comp{$a} ) {
              if (rand(1) >= .5) { $comp{$a} = 1 } else { $comp{$a} = 0; }
          }
          $comp{$a};
      } @foo;
      my $upper = int( scalar(@feh) * .75 );
      print join (",", @feh[ 0 .. $upper]);

      #C:\>perl randsort.pl (output)
      #b,a,d,c,f,e,h,g,i,j,k,l,m,n,r,q,p,o,s
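      One observation on the above: the memoized comparator depends only on $a and returns 0 or 1, so it still does not describe a consistent ordering, and the sample output stays close to the original order. A possible variation on the "rand once" idea, offered only as a sketch, is to draw one random key per element before sorting and then compare the keys. The variable $count is an assumption made for the example.

```perl
use strict;
use warnings;

my @foo = qw( a b c d e f g h i j k l m n o p q r s t u v w y z );

# Draw one random key per element up front, so the comparator always
# gives the same answer for the same pair of elements.
my @feh = map  { $_->[1] }                  # strip the keys again
          sort { $a->[0] <=> $b->[0] }      # order by the random key
          map  { [ rand(), $_ ] } @foo;     # pair each element with a key

my $count = int(@feh * .75);                # number of records to keep
print join(",", @feh[0 .. $count - 1]), "\n";
```

      Because every key is fixed before the sort starts, this should satisfy the consistency requirement quoted from perldoc earlier in the thread.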