in reply to Re: compare strings intuitively
in thread compare strings intuitively

I'm sorting through a bunch of data with an author field. Sometimes an author is represented as 'a name' and other times as 'am name', things like that. My goal is to work out which authors are the same person.
"what matters to you is the number of operations required to turn one string into another" - yeah, that's pretty much right, but with a few exceptions.
'a name' is obviously different from 'b name'
but
'am name' could be 'a name'
The substitution method wouldn't work on its own here.
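
One way to express that distinction is an edit distance in which an inserted or dropped letter is cheap and a swapped letter is expensive. The following is only a rough sketch in Python, not anything taken from this thread: the function names, the costs (1 for an insert/delete, 2 for a substitution) and the threshold of 1 are all assumptions picked to fit the two examples above.

```python
def weighted_distance(a, b, indel_cost=1, sub_cost=2):
    """Edit distance where a substitution costs more than an insert/delete.

    The costs are illustrative assumptions, not tuned values.
    """
    # prev[j] = cost of turning the first i-1 chars of a into the first j chars of b
    prev = [j * indel_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        curr = [i * indel_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + indel_cost,                         # drop ca
                curr[j - 1] + indel_cost,                     # insert cb
                prev[j - 1] + (0 if ca == cb else sub_cost),  # keep or swap
            ))
        prev = curr
    return prev[-1]


def probably_same(a, b, threshold=1):
    """Hypothetical helper: treat two author strings as one person if they are close enough."""
    return weighted_distance(a, b) <= threshold


print(weighted_distance('am name', 'a name'))  # one dropped letter
print(weighted_distance('a name', 'b name'))   # one swapped letter
```

With these assumed costs, 'am name' vs 'a name' scores 1 and 'a name' vs 'b name' scores 2, so a threshold of 1 separates the two cases. It will still produce false positives whenever two different people differ by a single extra letter.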

Re^3: compare strings intuitively
by derby (Abbot) on May 07, 2007 at 12:18 UTC

    Good luck ... this is a tough nut to crack with any non-trivial set of data - the number of false positives is going to be high. Uncommon names will work well but the m. smiths of the world are not going to be happy campers.

    -derby