The code I gave you was written on the fly on the command line rather than stored in a script file, so it would need a little modification to be used as a script. The enclosing single quotes around the code and the -E flag would go, and the command-line -M flags would be incorporated in the script as

use strict; use warnings;

lines at the top of your code. I use q{...} and qq{...} instead of '...' and "..." because it makes it easier to write code on the command line in both Unix/Linux and MS Windows environments, but they are functionally equivalent.
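As a rough sketch of that conversion, here is how a made-up one-liner (shown in the comment, and not the code from earlier in this thread) might look as a stored script. The variable names are purely illustrative, and the use feature line is only needed if the code relies on say, which -E switches on for you on the command line.

```perl
#!/usr/bin/perl
# Hypothetical one-liner this script corresponds to (not the thread's code):
#   perl -Mstrict -Mwarnings -E 'my $name = q{World}; say qq{Hello, $name};'
use strict;               # was -Mstrict
use warnings;             # was -Mwarnings
use feature qw{ say };    # -E enabled say on the command line

my $name     = q{World};            # q{...} behaves like '...', no interpolation
my $greeting = qq{Hello, $name};    # qq{...} behaves like "...", $name interpolates

say $greeting;    # prints "Hello, World"
```

Because the code no longer sits inside shell quotes, you could go back to '...' and "..." in the script if you prefer.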

Some points about your translation:-

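You could use a while loop with a block to get the total number of hits by changing

$words{$1} ++ while $text =~ m{$rxWords}g;

to

my $totalHits = 0; while ( $text =~ m{$rxWords}g ) { $totalHits ++; $words{$1} ++; }

Note that a do { ... } while ...; construct would not work here: a do block runs its body once before the condition is first tested, so it would count one hit too many and use $1 before any match had been made.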
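Here is a small self-contained sketch of counting the total hits alongside the per-word counts, using a plain while loop so that the body only runs after a successful match. The sample text and the pattern are made up for illustration; $rxWords here merely stands in for whatever pattern your real code builds.

```perl
use strict;
use warnings;

# Hypothetical test data, not taken from the thread.
my $text    = q{the cat sat on the mat with the other cat};
my $rxWords = qr{\b(cat|the)\b};    # assumed stand-in for the real pattern

my %words;
my $totalHits = 0;

# In scalar context with the g modifier, each pass of the loop
# picks up the next match and leaves the captured word in $1.
while ( $text =~ m{$rxWords}g )
{
    $totalHits ++;
    $words{$1} ++;
}

print qq{$_: $words{$_}\n} for sort keys %words;
print qq{Total hits: $totalHits\n};
```

With this sample text it prints cat: 2, the: 3 and Total hits: 5; the "the" inside "other" is not counted because of the \b word boundaries.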

I'll leave you to see if you can work out how to get the total number of words given these clues; the regex pattern \b\w+\b and the g match modifier. Play around with some simple test text and see if you can solve the problem for yourself then apply it to your real code. Doing is far and away the best way of learning!

I hope this helps you move forward.

Cheers,

JohnGG


In reply to Re^3: count number of overlapping words in a document by johngg
in thread count number of overlapping words in a document by dmarcel
