Here's a routine that should return a random line from a file, without needing to read the entire contents of the file into memory:
my $file = '/usr/dict/words';

#selecting a random line from a file
my $line = do {
    open FH, "<$file" or die "Could not open $file: $!";

    #Go to a random spot in the file
    seek FH, rand((-s FH) + 1), 0;

    #This forces us to the next line, because
    #we might be positioned in the middle of
    #a line in FH.
    <FH>;

    #If the next call to <FH> is about to
    #return a part of the last line in the
    #file, then wrap around to the beginning
    #of the file. We ONLY want complete lines.
    seek(FH, 0, 0) if eof;

    #Return the next available, complete, line
    <FH>;
};

Here's an explanation of the code:

Most similar routines break when the random seek lands in the last line of the file. They call <FH> once to discard the partial line, which moves the file cursor to the start of the next line. The problem is that if you are already positioned somewhere inside the *last* line, the second <FH> returns undef, which is not usually what you want.

Another flaw in the standard routine of this type is that it always returns the line after the one that was randomly seek'd to. This means that no matter how many times you run the routine, it will *always* miss line 1.

The above routine gets around that by simply checking whether we are at the end of the file, using eof. If we are, we wrap around to the beginning. This solves both of the problems above while still maintaining an acceptable level of randomness with rand.
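For reuse, the same seek-and-wrap idea can be sketched as a subroutine. This is only a sketch, not a replacement for the routine above: the name random_line_seek is made up for illustration, and the lexical filehandle and three-argument open are my own substitutions for the bareword FH.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption, not from the original post):
# return one complete line from $path, chosen by seeking to a random offset.
sub random_line_seek {
    my ($path) = @_;
    open my $fh, '<', $path or die "Could not open $path: $!";

    # Pick an integer offset in 0 .. file size (inclusive) and jump there.
    seek($fh, int(rand((-s $fh) + 1)), 0);

    # Discard the rest of the line we landed in; it may be partial.
    my $partial = <$fh>;

    # Nothing left to read, so wrap around to the first line.
    seek($fh, 0, 0) if eof($fh);

    # Read the next complete line (undef only if the file is empty).
    my $line = <$fh>;
    close $fh;
    return $line;
}
```

Calling print random_line_seek('/usr/dict/words'); prints one word, newline included.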

Update: This routine is biased towards longer lines in the file. Do not use it for files whose lines vary in length. Use the methods described in "How do I pick a random line from a file?" instead.
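For comparison, here is a minimal sketch of one well-known unbiased alternative: the single-pass technique from perlfaq5, where line number N replaces the current pick with probability 1/N. I am assuming, not asserting, that this is among the methods the update points to. The name random_line_uniform is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is an assumption): pick one line uniformly
# at random in a single pass, keeping only one line in memory at a time.
sub random_line_uniform {
    my ($path) = @_;
    open my $fh, '<', $path or die "Could not open $path: $!";

    my $chosen;
    while (my $line = <$fh>) {
        # $. is the number of lines read so far, so the current line
        # replaces the previous pick with probability 1/$.
        $chosen = $line if rand($.) < 1;
    }
    close $fh;
    return $chosen;    # undef for an empty file
}
```

The trade-off is that this reads the whole file once, so it is slower than seeking on large files, but every line has the same chance regardless of length.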


In reply to Re: Random Line from a file by dkubb
in thread RandomFile by jettero
