in reply to Re: Re: Random entry from combined data set
in thread Random entry from combined data set

Do you know the number of lines in each file in advance?
If so, it may be more efficient to pick a file at random, weighted by the number of lines it contains, and then pick a random line from that file.
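A minimal sketch of that two-stage pick, in Python for brevity. It assumes the per-file line counts are already known; the name `pick_file_and_line` and the `line_counts` mapping are made up for illustration. Because each file is weighted by its line count, every line in the combined set should come out equally likely.

```python
import random

def pick_file_and_line(line_counts, rng=random):
    """Pick (path, zero-based line index) uniformly over all lines.

    line_counts is a hypothetical dict mapping each file path to its
    number of lines, assumed to be known in advance.
    """
    paths = list(line_counts)
    weights = [line_counts[p] for p in paths]
    # Stage 1: choose a file with probability proportional to its line count.
    path = rng.choices(paths, weights=weights, k=1)[0]
    # Stage 2: choose a line index uniformly within that file.
    return path, rng.randrange(line_counts[path])
```

Fetching the chosen line still means scanning that one file up to the index, unless an index of line offsets is available.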

Re: Re: Re: Re: Random entry from combined data set
by merlyn (Sage) on Jul 04, 2001 at 12:43 UTC
    Only if the lines are all the same length, or you have an index to the beginning of each line. The "select a random starting byte" approach will unfairly bias the selection based on line size (either the line's own size or the size of the line before or after it, depending on the algorithm).
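To put numbers on that bias, here is a small Python sketch. It assumes the simplest variant of the byte-based approach, where the line containing the randomly chosen byte is returned; the function name `byte_pick_probabilities` is hypothetical, and each line is assumed to end in a single newline byte.

```python
from fractions import Fraction

def byte_pick_probabilities(lines):
    """Chance of each line being picked when a random byte offset is chosen
    and the line containing that byte is returned (one assumed variant)."""
    # Each line occupies its own length plus one byte for the newline.
    sizes = [len(line) + 1 for line in lines]
    total = sum(sizes)
    return [Fraction(size, total) for size in sizes]
```

For a 1-character line next to a 9-character line this gives 1/6 and 5/6 instead of 1/2 each, so the longer line is picked five times as often.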

    -- Randal L. Schwartz, Perl hacker

      I did not suggest "select a random starting byte".
      I suggested "select a random starting file" (if the number of lines in each file is known in advance),
      then "select a random line" from that file.