in reply to Re^2: Improving speed match arrays with fuzzy logic
in thread Improving speed match arrays with fuzzy logic
If inflection (plural/singular) differences appear a lot, there is a module that converts a word to its plural and can also compare two words regardless of number: Lingua::EN::Inflect. It is very simple to use and not slow. From the manpage: print "same\n" if PL_eq($word1, $word2);
Phonix (and related algorithms, e.g. Metaphone) collapses words into a phonetic space, which is simpler for retrieval (e.g. with respect to spelling variations). There are modules available for doing that. It's probably worth applying the fuzzy distance in phonetic space and seeing whether that works faster without loss of accuracy.
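To make the idea concrete, here is a rough sketch in core Perl only. It is not Phonix or Metaphone: phonetic_key is a hypothetical Soundex-style stand-in I made up for illustration, and build_index and phonetic_matches are likewise hypothetical names. A real phonetic module from CPAN would replace phonetic_key. The point is only that the edit distance is computed on short keys, once per distinct key, instead of on every raw word.

```perl
use strict;
use warnings;

# Soundex-style consonant classes; vowels and H, W, Y stay unmapped.
my %CLASS;
@CLASS{qw(B F P V)}         = (1) x 4;
@CLASS{qw(C G J K Q S X Z)} = (2) x 8;
@CLASS{qw(D T)}             = (3) x 2;
$CLASS{L}                   = 4;
@CLASS{qw(M N)}             = (5) x 2;
$CLASS{R}                   = 6;

# Hypothetical stand-in for Phonix/Metaphone: keep the first letter,
# then append consonant class digits, skipping repeats of the same class.
sub phonetic_key {
    my ($word) = @_;
    (my $w = uc $word) =~ s/[^A-Z]//g;
    return '' unless length $w;
    my @letters = split //, $w;
    my $key  = $letters[0];
    my $prev = $CLASS{ $letters[0] } // '';
    for my $ch (@letters[1 .. $#letters]) {
        my $code = $CLASS{$ch} // '';
        $key .= $code if $code ne '' && $code ne $prev;
        $prev = $code;
    }
    return $key;
}

# Plain Levenshtein edit distance, keeping two rows of the DP table.
sub levenshtein {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));
    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            my $best = $prev[$j - 1] + $cost;                      # substitution
            $best = $prev[$j] + 1    if $prev[$j] + 1 < $best;     # deletion
            $best = $cur[$j - 1] + 1 if $cur[$j - 1] + 1 < $best;  # insertion
            push @cur, $best;
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Compute the keys once: phonetic key => list of original words.
sub build_index {
    my %index;
    push @{ $index{ phonetic_key($_) } }, $_ for @_;
    return \%index;
}

# Return the indexed words whose key is within $max_dist edits of the query key.
sub phonetic_matches {
    my ($index, $query, $max_dist) = @_;
    my $qkey = phonetic_key($query);
    return map  { @{ $index->{$_} } }
           grep { levenshtein($qkey, $_) <= $max_dist }
           sort keys %$index;
}

my $index = build_index(qw(color Smith Robert table Rupert));
print join(', ', phonetic_matches($index, 'colour', 0)), "\n";   # prints "color"
print join(', ', phonetic_matches($index, 'Smyth',  0)), "\n";   # prints "Smith"
```

With a distance of 0 this degenerates into a hash-style lookup on the key; raising $max_dist trades speed for tolerance. Whether accuracy survives on your data is something only a test against your arrays will tell.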
Apropos MXNet, there seems to be LSTM support, judging from this example: https://metacpan.org/source/SKOLYCHEV/AI-MXNet-1.33/examples/char_lstm.pl . You might be able to make use of something like the procedure described here: https://machinelearningmastery.com/sequence-classification-lstm-recurrent-neural-networks-python-keras/
Using MXNet's Perl modules requires you to download MXNet from https://mxnet.apache.org/versions/master/install/index.html?platform=Linux&language=Python&processor=CPU and install it, and possibly compile it too.
Re^4: Improving speed match arrays with fuzzy logic
by bliako (Monsignor) on Jan 19, 2019 at 12:00 UTC
by erzuuli (Cannon) on Jan 19, 2019 at 16:27 UTC