http://qs1969.pair.com?node_id=562145


in reply to grabbing random n rows from a file

I think this method is accurate and fair:
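open my $some_filehandle, "<", "quotefile.txt"
    or die "Can't open quotefile.txt: $!";
my $set_size = 3;
my $set = random_set_of_n($some_filehandle, $set_size);

sub random_set_of_n {
    my ($fh, $size) = @_;
    my @set;
    my $seen = 0;    # our own line counter: seek does not reset $.
    local $_;
    seek $fh, 0, 0;

    # fill the set with the first $size lines
    while (<$fh>) {
        chomp;
        $seen++;
        push @set, $_;
        last if @set == $size;
    }
    # XXX: @set *should* be shuffled now if you care about ordering

    # each later line replaces a random element with probability $size/$seen
    while (<$fh>) {
        chomp;
        $seen++;
        $set[rand @set] = $_ if $size/$seen > rand;
    }
    return \@set;
}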
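I think it's a fair distribution, and my tests imply it is: line n (for n > $size) replaces a random element of the set with probability $size/n, so every line in the file has the same chance of ending up in the result.

Update: the set should be shuffled where I've indicated. That's only necessary if you want a randomly ordered list returned, not if you're going to be plucking elements from it at random later on.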
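
For anyone who wants to repeat the fairness check, here is a rough sketch of the kind of test I mean. It is not the exact test I ran. The sub name sample_n_lines, the ten-line in-memory file, and the trial count are all made up for illustration, and the sub is a standalone variant of the same idea that keeps its own line counter and does the shuffle (via List::Util) at the point marked XXX.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Standalone variant of the sampling idea (hypothetical name):
# keep the first $size lines, then let line number $seen replace
# a random element with probability $size/$seen.
sub sample_n_lines {
    my ($fh, $size) = @_;
    my (@set, $seen);
    while (my $line = <$fh>) {
        chomp $line;
        $seen++;
        if (@set < $size) {
            push @set, $line;
            # shuffle once the initial set is full, so the order is random too
            @set = shuffle @set if @set == $size;
        }
        elsif (rand() < $size / $seen) {
            $set[rand @set] = $line;
        }
    }
    return \@set;
}

# Made-up test data: ten lines "1" .. "10" in an in-memory file.
my $data   = join '', map { "$_\n" } 1 .. 10;
my $trials = 20_000;
my %count;

for (1 .. $trials) {
    open my $fh, '<', \$data or die "Can't open in-memory file: $!";
    $count{$_}++ for @{ sample_n_lines($fh, 3) };
    close $fh;
}

# Each line should turn up in roughly 3/10 of the trials.
printf "%2d: %.3f\n", $_, $count{$_} / $trials for 1 .. 10;
```

If the method is fair, all ten frequencies should sit close to 0.3, with no drift between the early lines and the late ones.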

Jeff japhy Pinyan, P.L., P.M., P.O.D, X.S.: Perl, regex, and perl hacker
How can we ever be the sold short or the cheated, we who for every service have long ago been overpaid? ~~ Meister Eckhart