Since I'm not an expert in Perl regexes, I started digging into haukex's code, commenting his original version and looking for a simpler loop.

Here we go:

use warnings;
use strict;
use Test::More tests=>2;

my $str = "iowq john stepy andy anne alic bert stepy anne bert andy step alic andy";
my %names;

=for comment
pos
Returns the offset of where the last m//g search left off for the variable in question ($_ is used when the variable is not specified).
Note that 0 is a valid match offset.
undef indicates that the search position is reset (usually due to match failure, but can also be because no match has yet been run on the scalar).

=cut

pos($str)=undef;

=for comment
https://www.regular-expressions.info/continue.html
The position where the last match ended is a "magical" value that is remembered separately for each string variable.
The position is not associated with any regular expression.
This means that you can use \G to make a regex continue in a subject string where another regex left off.
If a match attempt fails, the stored position for \G is reset to the start of the string. To avoid this, specify the continuation modifier /c.

=cut

while ($str=~/\G        #start where the last match ended
              \s*       #match 0 to n space char
              (\S+)     #remember any non space char after that and followed by
              (?:       #start clustering of
                \s+|\z  #1 to n spaces or the end of the string
              )         #end clustering
             /gcx) {
    $names{$1}++;
}
die "failed to parse \$str" unless pos($str)==length($str);

test_it(\%names);

%names = ();

#Takes a new variable
#my $str2 = "iowq john stepy andy anne alic bert stepy anne bert andy step alic andy";
#or reset pos for the original var
pos($str)=undef;

my $last;
while ($str=~/(\w+)/g) {
    #print $1, " ", pos $str, "\n";
    $names{$1}++;
    $last = pos $str;
}
die "failed to parse \$str" unless $last==length($str);

test_it(\%names);

sub test_it {
    my $hr_names = shift;
    is_deeply $hr_names, {
        alic => 2, andy => 3, anne => 2, bert => 2,
        iowq => 1, john => 1, step => 1, stepy => 2 };
}

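To convince myself of what the two quoted comments say about pos, \G and /c, here is a minimal sketch on a shorter string. The string and the variable names ($demo, $after_first and so on) are made up for illustration and are not part of haukex's code.

```perl
use strict;
use warnings;

my $demo = "john andy";   # hypothetical sample string

# A successful m//g in scalar context records where the match ended.
my $ok = $demo =~ /\w+/g;
my $after_first = pos($demo);        # 4, just past "john"

# A failed m//g without /c resets the position to undef.
$ok = $demo =~ /\d+/g;
my $after_fail = pos($demo);         # undef

# With /c, a failed match leaves the stored position alone.
pos($demo) = 4;
$ok = $demo =~ /\d+/gc;
my $after_fail_c = pos($demo);       # still 4

# \G anchors the next attempt at the stored position.
my $second;
$second = $1 if $demo =~ /\G\s*(\w+)/gc;   # picks up "andy"

printf "after first match: %s\n", $after_first;
printf "after failed /g:   %s\n", defined $after_fail ? $after_fail : 'undef';
printf "after failed /gc:  %s\n", $after_fail_c;
printf "next word via \\G:  %s\n", $second;
```

This is why the first loop can check pos($str)==length($str) after it ends: with /c the last failed attempt does not throw the position away, whereas the second loop has to save pos in $last on every iteration.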
I have 3 questions

Cheers

François


In reply to Re^2: counting words in string by frazap
in thread counting words in string by Anonymous Monk
