While I was thinking about methods that don't involve preprocessing, a nice solution to your problem hit me smack in the face. I didn't even see it coming.

You could create a file that includes nothing but the locations of the beginning of each line in /usr/dict/words. It would be big, but it would give you simple, efficient, and very effective solutions to your problem and many similar ones.
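Here's a rough sketch of how such an index might be built. The sub name `build_index` and the record format (each offset packed as a 4-byte big-endian integer) are my own assumptions; any fixed-width format would do.

```perl
use strict;
use warnings;

# Write one fixed-length record per line of $words_path: the byte offset
# at which that line starts, packed as a 32-bit big-endian integer ('N').
# Returns the number of records written.
# (build_index and the 4-byte record format are illustrative assumptions.)
sub build_index {
    my ($words_path, $index_path) = @_;

    open my $in,  '<:raw', $words_path or die "Can't read $words_path: $!";
    open my $out, '>:raw', $index_path or die "Can't write $index_path: $!";

    my $count  = 0;
    my $offset = 0;
    while (my $line = <$in>) {
        print {$out} pack('N', $offset);
        $offset += length $line;    # counts bytes, since the handle is :raw
        $count++;
    }

    close $out or die "Can't close $index_path: $!";
    close $in;
    return $count;
}
```

Calling `build_index('/usr/dict/words', 'words.idx')` would produce an index of four bytes per line, so record number i always sits at byte 4*i of the index file.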

For example, your problem would be solved by the following:

1. Read a certain number of random locations from the index file (easy with constant-length records).
2. Check each location (in /usr/dict/words) until you find an n-letter word. If you don't find one, go back to step 1.
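The two steps might look something like this in Perl. This is only a sketch: it assumes the index holds one 4-byte big-endian offset per line of the word list, and the sub name `random_word`, the `$batch` argument, and the `$max_rounds` cap (there so a length that never occurs can't loop forever) are all made up for illustration.

```perl
use strict;
use warnings;

# Return a random $n-letter word, or nothing if $max_rounds rounds of
# $batch probes each fail to find one.
# Assumes $index_path holds 4-byte big-endian line offsets into $words_path.
sub random_word {
    my ($words_path, $index_path, $n, $batch, $max_rounds) = @_;

    open my $idx,   '<:raw', $index_path or die "Can't open $index_path: $!";
    open my $words, '<:raw', $words_path or die "Can't open $words_path: $!";

    my $records = int((-s $index_path) / 4);
    return unless $records;

    for my $round (1 .. $max_rounds) {
        # Step 1: read $batch random records from the index file.
        my @offsets;
        for (1 .. $batch) {
            seek($idx, 4 * int(rand $records), 0) or die "seek failed: $!";
            read($idx, my $rec, 4) == 4 or die "short read on index";
            push @offsets, unpack('N', $rec);
        }

        # Step 2: check each location until an n-letter word turns up.
        for my $offset (@offsets) {
            seek($words, $offset, 0) or die "seek failed: $!";
            my $word = <$words>;
            next unless defined $word;
            chomp $word;
            return $word if length($word) == $n;
        }
        # Nothing found this round: fall through and go back to step 1.
    }
    return;
}
```

Because every line is equally likely to be probed and the first match is returned, each n-letter word should come up with equal probability.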

The number of locations you read in step 1 would have to be tuned for performance (reading time vs. how often the algorithm must be restarted), and might vary with the length of word you want. For instance, you could hard-code an array @num in your script so that reading $num[$n] locations would locate an n-letter word 95% of the time, and the algorithm would be restarted the remaining 5% of the time. The numbers wouldn't be very big, and you wouldn't have to hit the optimum exactly to get good performance.
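One way to come up with those numbers, assuming each probe picks a line independently and uniformly: if a fraction p of the lines are n-letter words, then k probes all miss with probability (1 - p)^k, so you want the smallest k with (1 - p)^k <= 0.05, i.e. k >= log(0.05) / log(1 - p). A sketch that derives the table from the word list itself (the sub name `probe_counts` is hypothetical):

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Return a list in which element n is the number of independent random
# probes needed to hit an n-letter word with probability >= $target.
# Lengths that never occur in the file are left undef.
sub probe_counts {
    my ($words_path, $target) = @_;

    open my $in, '<', $words_path or die "Can't read $words_path: $!";
    my @count;
    my $total = 0;
    while (my $word = <$in>) {
        chomp $word;
        $count[length $word]++;
        $total++;
    }
    close $in;

    my @num;
    for my $n (0 .. $#count) {
        next unless $count[$n];
        my $p = $count[$n] / $total;    # chance that one probe hits length n
        # Smallest k with (1 - p)**k <= 1 - target; p == 1 needs only one probe.
        $num[$n] = $p >= 1 ? 1 : ceil(log(1 - $target) / log(1 - $p));
    }
    return @num;
}
```

You'd run something like `my @num = probe_counts('/usr/dict/words', 0.95);` once and paste the results into your script, rounding up if you'd rather restart less often.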

The best thing about this solution is that the index file would be useful for almost any program that needs to get random words from a certain subset of /usr/dict/words. If you have any other problems similar to the one you posted, this might be the way to go.


In reply to Re: RandomFile (Simple preprocessor) by grackle
in thread RandomFile by jettero
