in reply to String Comparison & Equivalence Challenge

Good morning :)

> All of this depends on being able, first and foremost, to measure the equivalence of two different strings.

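I think you want to take a look at tf-idf (term frequency-inverse document frequency) in combination with a stemmer. The stemmer folds inflected forms like "comparing" and "compared" onto one term, and tf-idf then weights rare terms higher than common ones, so you can compare two strings as weighted term vectors (e.g. by cosine similarity).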
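To make that a bit more concrete, here is a minimal sketch of the idea, written in Python purely for illustration (the dict-based structure should carry over to Perl hashes). Everything in it is an assumption on my side: `crude_stem` is a toy suffix stripper and not a real stemmer such as Porter's, the smoothed idf formula is just one common variant, and the sample strings are made up.

```python
import math
import re
from collections import Counter

def crude_stem(word):
    """Toy suffix stripper (assumption for this sketch, NOT a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least 4 characters remain, so "string" stays intact.
        if word.endswith(suffix) and len(word) - len(suffix) >= 4:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, split on non-alphanumerics, stem every word."""
    return [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

def tfidf_vectors(docs):
    """Return one {term: weight} dict per input string."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per document
    # Smoothed idf (one common variant): terms present everywhere keep a small weight.
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    vectors = []
    for tokens in token_lists:
        tf = Counter(tokens)
        vectors.append({t: (c / len(tokens)) * idf[t] for t, c in tf.items()})
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up sample strings, only to show the call pattern.
docs = [
    "comparing two strings",
    "compared the two string",
    "weather forecast for tomorrow",
]
vecs = tfidf_vectors(docs)
print(cosine(vecs[0], vecs[1]))  # related strings: clearly above zero
print(cosine(vecs[0], vecs[2]))  # no shared terms: 0.0
```

In practice you would swap `crude_stem` for a proper stemmer and compute the idf weights over your whole collection of strings, not just the pair being compared.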

And you might also want to score partial word groups like sub-phrases (word n-grams) to take word order into account, since plain tf-idf treats a string as an unordered bag of words.
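One way to read "rank partial word groups" is to compare the sets of word n-grams of both strings. The following is only a sketch under my own assumptions (again in Python for illustration): the function names are hypothetical, Jaccard overlap is just one possible measure, and weighting longer n-grams by their length `n` is an arbitrary choice you would want to tune.

```python
import re

def word_ngrams(text, n):
    """Set of all runs of n consecutive words in the text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def phrase_overlap(a, b, max_n=3):
    """Weighted mean of Jaccard overlaps for n = 1..max_n.

    Longer n-grams get weight n (an assumption of this sketch),
    so matching sub-phrases count more than matching single words.
    """
    total = 0.0
    weight_sum = 0.0
    for n in range(1, max_n + 1):
        grams_a = word_ngrams(a, n)
        grams_b = word_ngrams(b, n)
        union = grams_a | grams_b
        if not union:
            continue  # both strings are shorter than n words
        total += n * len(grams_a & grams_b) / len(union)
        weight_sum += n
    return total / weight_sum if weight_sum else 0.0

same = phrase_overlap("the quick brown fox", "the quick brown fox")
shuffled = phrase_overlap("the quick brown fox", "fox brown quick the")
print(same, shuffled)  # identical word sets, but the shuffled pair scores much lower
```

A bag-of-words measure would rate the shuffled pair as identical, so combining a score like this with the tf-idf similarity is one way to penalise reordered words.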

This should give you a start.

HTH :)

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Update:

See also Re^5: String Comparison & Equivalence Challenge (tf-idf)
