in reply to Building a new file by filtering a randomized old file on two fields

So you only ever need at most N random lines?

Here is what I'd do, in pseudocode:

    my @candidates = sort { $a <=> $b } nRandomIntegers( $wantedN );
    my @chosen;
    while( "not done" ){
        my $line = readline $file;
        "keep reading if current line is not first candidate"; #:)
        if( satisfyConditions( $line, \@chosen ) ){
            push @chosen, $line;
            shift @candidates;
        }
        "stop reading if chosen wantedN lines";
        "pick new candidates when run out of candidates";
        "rewind filehandle if reach the end but not wantedN";
        "give up, rewound three times, never gonna happen";
    }
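For what it's worth, here is one way the pseudocode above could be fleshed out into something runnable. It is only a sketch under several assumptions of mine, not anything taken from the original question: the names `satisfy_conditions` and `pick_lines` are made up, the filter assumes tab-separated lines whose first two fields must each be unique among the chosen lines, and the total line count is assumed to be known up front. It also simplifies the loop a little: each pass over the file uses one sorted batch of untried line numbers, a candidate is dropped whether or not it passes the filter, and the file is rewound between passes.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical filter (an assumption, not from the original question):
# accept a line only if neither of its first two tab-separated fields
# already appears in the same position among the chosen lines.
sub satisfy_conditions {
    my ($line, $chosen) = @_;
    my ($f1, $f2) = split /\t/, $line;
    for my $kept (@$chosen) {
        my ($k1, $k2) = split /\t/, $kept;
        return 0 if $k1 eq $f1 or $k2 eq $f2;
    }
    return 1;
}

# Pick up to $wanted_n lines from a seekable filehandle.
# $total_lines is assumed known (e.g. counted in an earlier pass).
sub pick_lines {
    my ($fh, $total_lines, $wanted_n, $max_rewinds) = @_;
    my (@chosen, %tried);
    my $rewinds = 0;

    while (@chosen < $wanted_n) {
        # Pick new candidates: random, not yet tried, sorted numerically.
        my @pool = grep { !$tried{$_} } 1 .. $total_lines;
        last unless @pool;                     # every line has been tried
        my $need = $wanted_n - @chosen;
        $need = @pool if $need > @pool;
        my @candidates = sort { $a <=> $b } (shuffle @pool)[0 .. $need - 1];

        my $lineno = 0;
        while (@candidates) {
            my $line = <$fh>;
            last unless defined $line;         # reached end of file
            next unless ++$lineno == $candidates[0];   # keep reading
            shift @candidates;
            $tried{$lineno} = 1;
            chomp $line;
            push @chosen, $line if satisfy_conditions($line, \@chosen);
        }

        # Stop when done, or give up after too many rewinds.
        last if @chosen >= $wanted_n or $rewinds++ >= $max_rewinds;
        seek($fh, 0, 0) or die "cannot rewind: $!";
    }
    return @chosen;
}

# Usage with an in-memory file; a real file opened for reading works the same.
my $data = join '', map { "k$_\tv$_\n" } 1 .. 20;
open my $fh, '<', \$data or die "cannot open: $!";
print "$_\n" for pick_lines($fh, 20, 5, 3);
```

Keeping the candidates sorted numerically is what lets a single forward pass pick them all up; the `%tried` hash just keeps later passes from retesting lines that already failed.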

Does that make sense? Any questions?
