in reply to Split a sentence into words
my @vocabulary = qw(abd abcd abc a bc);
my $sentence = 'abdaabc';
my $pattern = join '|', @vocabulary;
my @words = $sentence =~ /($pattern)/g;
Note that @vocabulary has to be ordered so that "longer" words come earlier; i.e. if word x is a prefix of word y, then y must come before x in the list.
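Update: This does not actually work in general; it works only for some vocabularies. For example, with (abcd, abc, de) it will not split 'abcde' correctly: the regex grabs 'abcd' first and then cannot match the remaining 'e'. Things get complicated and computer-sciencey. See bart's and ikegami's replies below.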
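To illustrate the failing case, here is a minimal sketch of a backtracking split. It is not the code from the replies mentioned above, and split_words is a hypothetical helper name. It assumes the whole sentence must be covered by vocabulary words, and it returns the first segmentation found in vocabulary order.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns an array ref
# holding one segmentation of $sentence into vocabulary words, or undef
# if the sentence cannot be split completely.
sub split_words {
    my ($sentence, @vocabulary) = @_;

    # An empty remainder means everything before it was matched.
    return [] if $sentence eq '';

    for my $word (@vocabulary) {
        # Only consider non-empty words that the sentence starts with.
        next unless length($word) && index($sentence, $word) == 0;

        # Try to split the rest; if that fails, fall through to the next word.
        my $rest = split_words(substr($sentence, length($word)), @vocabulary);
        return [ $word, @$rest ] if $rest;
    }

    return;    # no vocabulary word leads to a complete split
}

my $result = split_words('abcde', qw(abcd abc de));
print join(' ', @$result), "\n";    # prints "abc de"
```

Because the function tries another word when a choice leads to a dead end, the order of @vocabulary no longer affects whether a split is found. Without memoization the running time can grow exponentially on long inputs, so treat this as an illustration only.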
Re^2: Split a sentence into words
by bart (Canon) on May 30, 2009 at 12:27 UTC