PerlMonks  

Re: String Comparison & Equivalence Challenge (tf-idf)

by LanX (Saint)
on Mar 14, 2021 at 07:34 UTC ( [id://11129603]=note )


in reply to String Comparison & Equivalence Challenge

Good morning :)

> All of this depends on being able, first and foremost, to measure the equivalence of two different strings.

I think you want to take a look at tf-idf (term frequency-inverse document frequency) in combination with a stemmer.
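To make the idea concrete, here is a minimal sketch of tf-idf scoring with cosine similarity, written in Python for brevity (it ports to Perl easily). Everything in it is an assumption for illustration: `crude_stem` is a toy suffix stripper standing in for a real stemmer such as Porter's, the idf uses one common smoothed variant, and the three sample strings are made up.

```python
import math
import re
from collections import Counter

def crude_stem(word):
    """Toy suffix stripper; a stand-in for a real stemmer (assumption)."""
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters of stem remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def tokenize(text):
    """Lowercase, split into alphanumeric words, stem each word."""
    return [crude_stem(w) for w in re.findall(r"[a-z0-9]+", text.lower())]

def tfidf_vectors(docs):
    """Return one {term: weight} dict per document."""
    token_lists = [tokenize(d) for d in docs]
    n_docs = len(token_lists)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))

    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens) or 1
        vectors.append({
            # tf = relative frequency; idf = smoothed log ratio plus one.
            term: (count / total)
                  * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors

def cosine(a, b):
    """Cosine similarity of two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Made-up sample strings.
docs = [
    "The cat is chasing mice",
    "Cats chased a mouse",
    "Stock prices fell sharply",
]
vecs = tfidf_vectors(docs)
sim_related = cosine(vecs[0], vecs[1])    # shares the stems "cat" and "chas"
sim_unrelated = cosine(vecs[0], vecs[2])  # shares no terms
```

The first two strings only match on "cat" and "chas" because of the stemming step. "mice" and "mouse" still do not match, which is the kind of gap a proper stemmer or lemmatizer is meant to close.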

And you might also want to rank partial word groups like sub-phrases to take word order into account.
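One possible way to read "partial word groups" is as word n-grams (runs of n consecutive words). The sketch below, in Python for brevity, compares two strings by the Jaccard overlap of their n-gram sets. Jaccard is my choice for the illustration, not something the approach requires, and the function names and sample strings are hypothetical; the n-grams could just as well be used as terms in a tf-idf weighting.

```python
import re

def word_ngrams(text, n):
    """Set of runs of n consecutive lowercase words in text."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def phrase_overlap(s1, s2, n=2):
    """Jaccard overlap of the word n-grams of two strings."""
    return jaccard(word_ngrams(s1, n), word_ngrams(s2, n))

# Made-up sample strings: same words, different order.
same_order = phrase_overlap("dog bites man", "dog bites man")
swapped = phrase_overlap("dog bites man", "man bites dog")
unigram_swapped = phrase_overlap("dog bites man", "man bites dog", n=1)
```

With single words (`n=1`) the two swapped strings look identical, while with bigrams they share nothing. That is how sub-phrases bring word order into the score.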

This should give you a start.

HTH :)

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

update

See also Re^5: String Comparison & Equivalence Challenge (tf-idf)
