I saw split // not split / /
That's why added the NB part saying to exclude white spaces and punctuation (which isn't done in the OP s code)
I haven't run ° it but the code looks broken to me if the split wasn't meant to be per character. @string holding words doesn't make sense to me!
I don't think that you can effectively process a natural language without regex.
Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice
°) I ran it on my mobile and the output shows that the OP is looking for n words in a row. Hence we both misunderstood his definition of n gram
START INDEX: 0 :this is START INDEX: 1 :is the START INDEX: 2 :the text START INDEX: 3 :text to START INDEX: 4 :to play START INDEX: 5 :play with START INDEX: 0 :this is the START INDEX: 1 :is the text START INDEX: 2 :the text to START INDEX: 3 :text to play START INDEX: 4 :to play with
In reply to Re^3: improving speed in ngrams algorithm (updated)
by LanX
in thread improving speed in ngrams algorithm
by IB2017
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |