in reply to Re^3: Fuzzy text matching... again
in thread Fuzzy text matching... again
I can't remember mentioning names anywhere in my response.
My last paragraph wasn't meant as a reply to anything in particular that you said; it was more of a "P.S." Sorry for not making that clear.
...these were publication citations making the "what is a name" germane but I don't see why it can't just be treated like any other token.
I just wanted to point out that if you understand what is what in "Archivio Giuliano Marini", you have a much better chance of telling whether something else, like "Archivio Giuliano Cassini", refers to the same thing or not. That is, knowing that "Archivio" simply means "archive" and that "Giuliano" is a common given name, you'd most likely figure out that they're two different archives, whereas a simple token comparison might identify them as the same (because two of the three tokens match and the third one sounds similar).
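To make that concrete, here is a minimal Python sketch of the difference. It is only an illustration under assumptions of my own: the two lookup sets, the weights 0.1 and 0.3, and the function names are all made up for this example, tokens are compared position by position, and difflib's character-level ratio stands in for "sounding similar". The point is just that down-weighting tokens you know to be generic words or common given names makes the surname carry most of the decision.

```python
from difflib import SequenceMatcher
from itertools import zip_longest

# Hypothetical lookup tables, invented for this example only.
# A real application would need much larger, domain-specific lists.
GENERIC_WORDS = {"archivio"}        # "archive": says nothing about which one
COMMON_GIVEN_NAMES = {"giuliano"}   # common given name: weak evidence


def token_similarity(a, b):
    """Character-level similarity in [0, 1] between two tokens."""
    return SequenceMatcher(None, a, b).ratio()


def token_weight(token):
    """How much a token should count (assumed, untuned weights)."""
    if token in GENERIC_WORDS:
        return 0.1
    if token in COMMON_GIVEN_NAMES:
        return 0.3
    return 1.0


def aligned_tokens(x, y):
    """Pair up tokens by position, padding the shorter side with ''."""
    return list(zip_longest(x.lower().split(), y.lower().split(), fillvalue=""))


def naive_score(x, y):
    """Plain average of per-token similarity; every token counts equally."""
    pairs = aligned_tokens(x, y)
    if not pairs:
        return 0.0
    return sum(token_similarity(a, b) for a, b in pairs) / len(pairs)


def weighted_score(x, y):
    """Weighted average, so informative tokens dominate the result."""
    total = 0.0
    weight_sum = 0.0
    for a, b in aligned_tokens(x, y):
        w = max(token_weight(a), token_weight(b))
        total += w * token_similarity(a, b)
        weight_sum += w
    return total / weight_sum if weight_sum else 0.0


if __name__ == "__main__":
    a = "Archivio Giuliano Marini"
    b = "Archivio Giuliano Cassini"
    print("naive:   ", round(naive_score(a, b), 3))
    print("weighted:", round(weighted_score(a, b), 3))
```

With these made-up weights, the weighted score for the two archives comes out lower than the naive one, so a fixed threshold is less likely to merge them. Where to put that threshold, and how to build the lookup tables, is of course the hard part.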