in reply to Re^9: Random sampling a variable length file.
in thread Random sampling a variable record-length file.

Taken together, these make even the extreme case just as amenable to this method as any other. If you remember which records you've hit and do not re-sample them, you're simply omitting a segment of the number line from a uniform distribution. The distributions on either side are still uniform, i.e., random.

Thankyou again! That makes a great deal of sense.

My first reaction was that remembering whether I had already picked a record was an awkward prospect given I olny have the byte position and no nknowledge of how long it is, then it dawned on me querying the offset once I've read the partial record make for a perfect signature.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"I'd rather go naked than blow up my ass"
  • Comment on Re^10: Random sampling a variable length file.