Please ignore! Misunderstood question.

My answer builds n-grams from characters, not words.


A regex should be faster; this demo in the debugger for n=3 should give you a start.

  DB<30> $str = join "", a..l
  DB<31> @res=()
  DB<32> for my $start (0..2) { pos($str) = $start; push @res, $str =~ m/(.{3})/g }
  DB<33> x @res
0  'abc'
1  'def'
2  'ghi'
3  'jkl'
4  'bcd'
5  'efg'
6  'hij'
7  'cde'
8  'fgh'
9  'ijk'

NB:

(I know it's possible in a single regex without looping over start by playing around with \K or similar. I'll leave it to the regex gurus like tybalt to show it ;-)
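For what it's worth, here is one way the single-regex variant could look. This is only a sketch, and it uses a capture inside a zero-width lookahead rather than \K; the variable names are just for illustration. Note that it returns the n-grams in string order, not grouped by start offset as in the loop above, and that . does not match a newline unless you add /s.

```perl
use strict;
use warnings;

my $str = join "", "a" .. "l";   # "abcdefghijkl"
my $n   = 3;

# The capture sits inside a zero-width lookahead, so each match consumes
# nothing and the regex engine retries one character further along.
# In list context, //g then returns every overlapping capture.
my @res = $str =~ /(?=(.{$n}))/g;

print "@res\n";   # abc bcd cde def efg fgh ghi hij ijk jkl
```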

HTH! :)

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice

update

In case you really want to include non-letters, try unpack.
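A rough sketch of what the unpack route might look like, keeping the same loop over start offsets as the regex demo. The helper name char_ngrams_unpack is made up for this example, and I'm assuming a plain string here (check the pack/unpack docs for how a and x behave on your data if it is UTF-8). unpack hands back a shorter leftover chunk at the end, so those are filtered out.

```perl
use strict;
use warnings;

# Hypothetical helper: character n-grams of $str, grouped by start offset.
sub char_ngrams_unpack {
    my ($str, $n) = @_;
    my @res;
    for my $start (0 .. $n - 1) {
        # Stop once not even one full n-gram fits from this offset.
        last if $start + $n > length $str;
        # Skip $start characters, then cut the rest into chunks of $n.
        my $skip = $start ? "x$start " : "";
        # Drop the short leftover chunk at the end, if any.
        push @res, grep { length($_) == $n } unpack "$skip(a$n)*", $str;
    }
    return @res;
}

print join("|", char_ngrams_unpack("a b,c", 2)), "\n";   # a |b,| b|,c
```

Since a takes any character, spaces and punctuation end up in the n-grams like everything else.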


In reply to Re: improving speed in ngrams algorithm (updated) by LanX
in thread improving speed in ngrams algorithm by IB2017
