Given a string of text, I need to create n-grams of predefined lengths. I came up with the following. Any suggestions on how to improve it (speed is an important factor in my process)? The sentence, i.e. the array, will typically contain 5-15 elements.
use strict;
use warnings;

my $sentence = "this is the text to play with";
my @string = split / /, $sentence;

my $ngramWindow_MIN = 2;
my $ngramWindow_MAX = 3;

for ($ngramWindow_MIN .. $ngramWindow_MAX){
    my $ngramWindow = $_;
    my $sizeString = (@string) - $ngramWindow;
    foreach (0 .. $sizeString){
        print "START INDEX: $_ :";
        print "@string[$_..($_+$ngramWindow-1)]\n";
    }
}
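For benchmarking, it may help to separate building the n-grams from printing them. The following is only a sketch of the same sliding-window loop wrapped in a subroutine that returns the n-grams as a list. The name `ngrams` and its argument order are hypothetical and not part of the original code, and whether this is actually faster would have to be measured.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): returns every n-gram of
# the word list, for each window size from $min to $max, as space-joined
# strings.
sub ngrams {
    my ($words, $min, $max) = @_;
    my @out;
    for my $n ($min .. $max) {
        my $last = @$words - $n;    # last valid start index for this window
        next if $last < 0;          # window is longer than the sentence
        for my $i (0 .. $last) {
            push @out, join ' ', @{$words}[$i .. $i + $n - 1];
        }
    }
    return @out;
}

my @words = split / /, "this is the text to play with";
print "$_\n" for ngrams(\@words, 2, 3);
```

Returning a list keeps the I/O out of the inner loop, so the n-gram construction can be timed on its own, for example with the core Benchmark module.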
In reply to improving speed in ngrams algorithm by IB2017