in reply to String Comparison & Equivalence Challenge
> All of this depends on being able, first and foremost, to measure the equivalence of two different strings.
I think you want to take a look at tf-idf (term frequency-inverse document frequency) in combination with a stemmer.
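To make that a bit more concrete, here is a rough Python sketch of the idea, not a finished solution. The names `crude_stem`, `tokenize`, `tfidf_vectors` and `cosine` are made up for illustration, the suffix stripper is only a stand-in for a real stemmer (Porter or Snowball, say), and the smoothed idf formula is just one common variant.

```python
import math
import re
from collections import Counter

def crude_stem(word):
    # Hypothetical stand-in for a real stemmer: strip a few common suffixes,
    # but only if at least three characters remain.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    # Lowercase, split into alphanumeric words, then stem each word.
    return [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

def tfidf_vectors(docs):
    token_lists = [tokenize(d) for d in docs]
    n = len(docs)
    # Document frequency: in how many strings does each term occur?
    df = Counter()
    for tokens in token_lists:
        df.update(set(tokens))
    # Smoothed idf (an assumption, other variants exist).
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append({t: (cnt / len(tokens)) * idf[t] for t, cnt in tf.items()})
    return vectors

def cosine(a, b):
    # Cosine similarity between two sparse term -> weight dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

docs = ["The cat is chasing mice", "Cats chased a mouse", "Stock prices fell sharply"]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

The first two strings share the stems "cat" and "chas", so they score above zero, while the third shares nothing with the first. Note that "mice" and "mouse" are not matched by such a naive stemmer.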
You might also want to score shared sub-phrases (groups of adjacent words, i.e. word n-grams), so that word order is taken into account.
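One possible way to do the sub-phrase part, again only as a Python sketch: compare the sets of adjacent word pairs (bigrams) with a Jaccard ratio. The function names `word_ngrams` and `ngram_overlap` and the choice of Jaccard are my assumptions, any other set or weighting scheme would do as well.

```python
import re

def word_ngrams(text, n=2):
    # Set of all runs of n adjacent words, lowercased.
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def ngram_overlap(a, b, n=2):
    # Jaccard ratio: shared n-grams divided by all distinct n-grams.
    ga, gb = word_ngrams(a, n), word_ngrams(b, n)
    if not ga and not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)

print(ngram_overlap("dog bites man", "man bites dog"))
print(ngram_overlap("the dog bites man", "dog bites man today"))
```

A plain bag-of-words measure sees "dog bites man" and "man bites dog" as identical, whereas the bigram overlap is zero because no adjacent pair survives the reordering. Combining both scores is left to taste.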
This should give you a start.
HTH :)
Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
See also Re^5: String Comparison & Equivalence Challenge (tf-idf)