Re^3: improving speed in ngrams algorithm (updated)

Seems like I misread the sample code.

I saw split // not split / /
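
To make the difference concrete, here is a minimal sketch. The sample string is my own, not the OP's data.

```perl
use strict;
use warnings;

my $text = 'this is the text';

# Empty pattern: split between every character, spaces included.
my @chars = split //, $text;

# Single-space pattern: split on each space, which yields words here.
my @words = split / /, $text;

print scalar(@chars), " chars, ", scalar(@words), " words\n";   # 16 chars, 4 words
```

Note that split / / produces empty fields when the input contains runs of spaces; split ' ' is the usual way to avoid that.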

That's why I added the NB part saying to exclude whitespace and punctuation (which isn't done in the OP's code).

I haven't run it° but the code looks broken to me if the split wasn't meant to be per character. @string holding words doesn't make sense to me!

I don't think that you can effectively process a natural language without regex.
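
As a rough illustration of what I mean, a regex can do the tokenizing and drop whitespace and punctuation in one step. This is only a sketch: tokenize() is a hypothetical helper name, and \w+ is a crude notion of "word" (it keeps digits and underscores, and splits "don't" into two tokens).

```perl
use strict;
use warnings;

# tokenize() is a hypothetical helper, not part of the OP's code.
sub tokenize {
    my ($text) = @_;
    # In list context, a /g match returns every run of word characters,
    # so whitespace and punctuation never reach the result.
    return lc($text) =~ /\w+/g;
}

my @tokens = tokenize('This is the text, to play with!');
print "@tokens\n";   # this is the text to play with
```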

Cheers Rolf
(addicted to the Perl Programming Language :)

Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice

Update

°) I ran it on my mobile and the output shows that the OP is looking for n words in a row. Hence we both misunderstood his definition of n-gram.

START INDEX: 0 :this is
START INDEX: 1 :is the
START INDEX: 2 :the text
START INDEX: 3 :text to
START INDEX: 4 :to play
START INDEX: 5 :play with
START INDEX: 0 :this is the
START INDEX: 1 :is the text
START INDEX: 2 :the text to
START INDEX: 3 :text to play
START INDEX: 4 :to play with
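
For comparison, here is a short sketch of word-level n-grams that prints the same listing. It is not the OP's code: word_ngrams() is a hypothetical name, and the input sentence is reconstructed from the output above.

```perl
use strict;
use warnings;

# word_ngrams() is a hypothetical helper: it returns every run of
# $n consecutive words, each joined by a single space.
sub word_ngrams {
    my ($n, @words) = @_;
    # The last valid start index is (number of words - $n);
    # the range is empty when there are fewer than $n words.
    return map { join ' ', @words[ $_ .. $_ + $n - 1 ] } 0 .. @words - $n;
}

# Input sentence reconstructed from the output shown above.
my @words = split ' ', 'this is the text to play with';

for my $n (2, 3) {
    my @grams = word_ngrams($n, @words);
    print "START INDEX: $_ :$grams[$_]\n" for 0 .. $#grams;
}
```

Each n-gram is built with an array slice, so the cost is one join per start index.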