in reply to Read a column; Compare strings for close (not exact) matches

You could try one of the "sounds like" modules from CPAN. Text::Metaphone and Text::Soundex are two that I have used. Both have limitations. It really depends on what you are willing to consider a "close match". Do you want 'Don Banks' to be a close match for 'Dawn Binx'? A phonetic match may well say yes, so you may get odd results.
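To make the 'Don Banks' / 'Dawn Binx' question concrete, here is a minimal sketch of a simplified Soundex-style coding. It is a hand-rolled illustration, not the implementation in Text::Soundex: the name simple_soundex is made up for this example, and it skips the special H/W rule of the full algorithm.

```perl
use strict;
use warnings;

# Simplified Soundex (illustrative only, NOT Text::Soundex itself):
# keep the first letter, map the rest to digit classes, collapse
# repeated codes, drop vowel-like letters, pad/truncate to 4 chars.
sub simple_soundex {
    my ($word) = @_;
    $word = uc $word;
    $word =~ s/[^A-Z]//g;
    return '' unless length $word;

    my $first = substr($word, 0, 1);

    # 0 = vowels and H/W/Y, 1 = BFPV, 2 = CGJKQSXZ, 3 = DT, 4 = L, 5 = MN, 6 = R
    (my $codes = $word)
        =~ tr/AEIOUYHWBFPVCGJKQSXZDTLMNR/00000000111122222222334556/;

    $codes =~ tr/0-6//s;         # squeeze runs of the same code
    substr($codes, 0, 1, '');    # the first letter is kept as a letter
    $codes =~ tr/0//d;           # drop the vowel-like placeholders

    return substr($first . $codes . '000', 0, 4);
}

# Code every word of a full name separately.
sub name_key {
    my ($name) = @_;
    return join ' ', map { simple_soundex($_) } split ' ', $name;
}

print name_key('Don Banks'), "\n";    # D500 B520
print name_key('Dawn Binx'), "\n";    # D500 B520
```

Both names collapse to the same key, which is exactly the kind of match you have to decide whether you want. Note that 'Bob' and 'Robert' get different keys here, so this approach does nothing for short forms.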

Alternatively, you will need to better define what constitutes a "close match". Is it something that a human would recognize, but that is hard to define programmatically? Off-by-one matches (a single character added, dropped, or changed) you may be able to program, but the code could be error-prone, and comparing every name against every other name is expensive in run time.
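One common way to program "off by a character or two" matching is an edit distance. The following is a rough sketch using plain Levenshtein distance and core Perl only. The names edit_distance and is_close and the default threshold of 1 are assumptions for illustration, not something from a particular CPAN module.

```perl
use strict;
use warnings;
use List::Util qw(min);

# Levenshtein distance: the minimum number of single-character
# insertions, deletions, or substitutions needed to turn $s into $t.
sub edit_distance {
    my ($s, $t) = @_;
    my @prev = (0 .. length($t));    # distances from the empty prefix of $s

    for my $i (1 .. length($s)) {
        my @cur = ($i);
        for my $j (1 .. length($t)) {
            my $cost = substr($s, $i - 1, 1) eq substr($t, $j - 1, 1) ? 0 : 1;
            push @cur, min(
                $prev[$j] + 1,            # deletion
                $cur[$j - 1] + 1,         # insertion
                $prev[$j - 1] + $cost,    # substitution (or exact match)
            );
        }
        @prev = @cur;
    }
    return $prev[-1];
}

# Hypothetical helper: case-insensitive "close enough" test.
sub is_close {
    my ($x, $y, $max) = @_;
    $max = 1 unless defined $max;
    return edit_distance(lc $x, lc $y) <= $max;
}

print "close\n" if is_close('Robert Carlos', 'Robert Carlos');
print "close\n" if is_close('Robert Carlos', 'Robert Carlo');
```

The cost concern is real: each comparison takes time proportional to the product of the two string lengths, and checking every name in one list against every name in another multiplies that by the number of pairs. A swapped pair of letters counts as two edits here, so a threshold of 1 will miss simple transpositions.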

Any additional info you can supply here would help us better gauge what you are trying to accomplish.

Ivan Heffner
Sr. Software Engineer, DAS Lead
WhitePages.com, Inc.

Re^2: Read a column; Compare strings for close (not exact) matches
by eyidearie (Novice) on Jul 07, 2005 at 19:03 UTC
    Hi, and thanks everyone. I looked at the Soundex module, but somewhere it said that it only considers English words and pronunciations... some of the names I have to work with are Indian, Chinese and even African :-(

    The problem in more detail: I have a group of people calling two different helpdesks. They give their names to be identified. I need to find out which people call both helpdesks, and I can only use their names as identification: they do not give their IDs, and their departments change so often that comparing those would not be useful. But the names can be misspelled by the helpdesk personnel; also, for example, a Robert Carlos could sometimes give his name as Bob Carlos :-(

    I don't know that I can do much about short forms of names, but I would like different spellings to be identified as the same person as much as possible.

    Thank you very much, Eyi.
      Lingua::EN::MatchNames would be a start, though I'm not sure how much support it has for Indian, Chinese, or African names.