Greetings all,
Many good comments so far; I thought I might give this one a shot. Here is the methodology I would try.
  1. Create a hash keyed by each of the words in your file; the values will be a count of how many times each word (key) appears.
  2. Test that you successfully open your file.
  3. Once it is open, read the file one line at a time with a while (<FILEHANDLE>) { #logic } loop.
  4. Lowercase all the characters in the line.
  5. In each line, replace every run of non-word characters with a single space (in case someone did not add a space after a period or a comma). This could be where you deal with your apostrophes as well.
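  6. Split the line on whitespace. (\b is the regex word-boundary assertion, but once the non-word characters have been turned into spaces, splitting on \b would also hand you the spaces as list elements.)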
  7. Go through the split list word by word, skipping any word that is not longer than four characters. If the word is already in the hash, ++ the hash element keyed by it; otherwise add the key to the hash and initialize its value to one. (In Perl, ++ on a key that does not exist yet creates it with a value of one, so the explicit test is optional.)
  8. Once all lines are done, sort the hash keys by their values, highest count first. "sort keys question" is a good discussion of how you can do that.
  9. Print the top ten.
  10. Marvel at the power of Perl.
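
For what it is worth, here is a rough sketch of how steps 1 through 9 might look in code. It is only one way to do it, not a definitive answer. The sub names count_words and top_words, the $sample text, and the choice to simply delete apostrophes are all my own assumptions for illustration; the sample is read through an in-memory filehandle so the script runs as-is, and you would open your real file in its place.

```perl
use strict;
use warnings;

# Steps 1 and 3-7: build a hash of word => count from an open filehandle.
# $min_len is the length a word must exceed to be counted.
sub count_words {
    my ($fh, $min_len) = @_;
    my %count;                                # step 1: word => occurrences
    while (my $line = <$fh>) {                # step 3: one line at a time
        $line = lc $line;                     # step 4: lowercase
        $line =~ s/'//g;                      # assumption: don't becomes dont
        $line =~ s/\W+/ /g;                   # step 5: non-word runs => one space
        for my $word (split ' ', $line) {     # step 6: split on whitespace
            next unless length($word) > $min_len;
            $count{$word}++;                  # step 7: ++ creates missing keys
        }
    }
    return \%count;
}

# Step 8: keys sorted by count, highest first (ties broken alphabetically).
sub top_words {
    my ($count, $n) = @_;
    my @sorted = sort { $count->{$b} <=> $count->{$a} or $a cmp $b }
                 keys %$count;
    $n = @sorted if $n > @sorted;             # fewer words than requested
    return @sorted[0 .. $n - 1];
}

# Hypothetical sample text standing in for your file.
my $sample = "Perl monks write Perl.\nMonks marvel at hashes,hashes and more hashes.\n";

# Step 2: test that the open succeeded.
open(my $fh, '<', \$sample) or die "Cannot open sample text: $!";
my $count = count_words($fh, 4);
close $fh;

# Step 9: print the top ten.
printf "%-10s %d\n", $_, $count->{$_} for top_words($count, 10);
```

To run it against a real file, replace \$sample in the open call with the file name. The length test uses "longer than four characters" as stated in step 7; change the second argument to count_words if you want a different cutoff.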

In reply to Re: tutelage needed by injunjoel
in thread tutelage needed by ctp
