in reply to selecting N random lines from a file in one pass
This can be beaten. But not by much...

    use strict;

    my $size = shift @ARGV || 3;   # sample size
    my @sample;

    # read n lines
    for (1..$size) {
        push @sample, scalar <>;
    }

    # shuffle - I'll do it in pure perl.
    my $i = $size;
    while ($i--) {
        my $j = int rand($i + 1);
        @sample[$i, $j] = @sample[$j, $i];
    }

    # now read the rest of the lines in.
    while (<>) {
        my $choice = int rand($.);
        if ($choice < $size) {
            $sample[$choice] = $_;
        }
    }

    # and minimize work by chomping as few lines as possible.
    chomp(@sample);
Update: Fixed an off-by-one error, per the note by AM below.
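For anyone checking the bounds in the replacement loop, here is a rough sketch of why `int rand($.)` compared against `$size` should give every line the same chance of ending up in the sample. The symbols are my own shorthand, not from the code: n is the sample size, N is the total line count, and k is the 1-based number of the line in question (what `$.` holds when that line is read). It assumes `rand` is uniform and that the file has at least n lines.

```latex
% Line m > n is accepted with probability n/m, because int rand(m) is
% uniform on {0, ..., m-1} and exactly n of those values are < n.
% When line m is accepted it overwrites one specific slot with probability 1/n,
% so a line already in the sample survives step m with probability
\[
  1 - \frac{n}{m}\cdot\frac{1}{n} \;=\; \frac{m-1}{m}.
\]
% A line k > n must be accepted at step k and then survive steps k+1, ..., N:
\[
  P(\text{line } k \text{ kept})
  \;=\; \frac{n}{k}\prod_{m=k+1}^{N}\frac{m-1}{m}
  \;=\; \frac{n}{k}\cdot\frac{k}{N}
  \;=\; \frac{n}{N}.
\]
% A line k <= n starts in the sample and must survive steps n+1, ..., N:
\[
  P(\text{line } k \text{ kept})
  \;=\; \prod_{m=n+1}^{N}\frac{m-1}{m}
  \;=\; \frac{n}{N}.
\]
```

The products telescope, so both cases come out to n/N. This only works if the random index at line k ranges over exactly k values, which is the bound the loop relies on.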
Re^2: selecting N random lines from a file in one pass
by Anonymous Monk on Dec 23, 2004 at 13:04 UTC