I suggest going canonical. You decide (or decide, try, and decide again) which things are equivalent, and write a function to transform all names into a canonical form. If two names collapse onto the same canonical form, then they are at least probably equivalent. If needed, you can then run whatever further tests you like on the probable matches to find the ones that are definite matches.
In your example, the canonical() sub would maybe replace hyphens in last names with spaces and eliminate the last initial. The sub doesn't need to return just one canonical form either; it can return an array of them for different types of matches.
Once you've done that, the rest is fairly straightforward.
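To make that a bit more concrete, here is a rough sketch of what such a canonical() sub might look like. The rules in it are only my guesses at your data: I'm assuming names arrive as plain strings with any initial at the end (e.g. 'Smith-Jones J.'), and the second, "no spaces" form is just an invented example of returning more than one canonical form. Adjust to taste.

```perl
use strict;
use warnings;

# Hypothetical canonical(): the rules below are assumptions about the
# input format, not something taken from the original question.
sub canonical {
    my ($name) = @_;
    my $c = lc $name;              # ignore case
    $c =~ tr/-/ /;                 # hyphens become spaces
    $c =~ s/\s+[a-z]\.?\s*$//;     # drop a trailing single-letter initial
    $c =~ s/^\s+|\s+$//g;          # trim leading/trailing whitespace
    $c =~ s/\s+/ /g;               # collapse runs of whitespace

    # A second, looser form with no spaces at all (illustrative only).
    (my $loose = $c) =~ s/ //g;

    # Return one form if both are identical, otherwise both.
    return $c eq $loose ? ($c) : ($c, $loose);
}

print join(' | ', canonical('Smith-Jones J.')), "\n";
```

Returning a list means the caller can file each name under every form it produces, so a strict match and a loose match end up in separate buckets.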
while (defined($name = shift @inp_names)) {
    for $canon (canonical($name)) {    # canonical() may return several forms
        # add $name to an anon array, stored under the $canon key
        push @{$possible_matches{$canon}}, $name;
    }
}
for $canon (keys %possible_matches) {
    print "matches under $canon= ",
        join(',', @{$possible_matches{$canon}}),
        "\n";
}
# for illustration. untested
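The loop above prints every canonical form, including those that only one name landed on. If you only care about the probable matches, something along these lines could filter the hash down first. This is a sketch only: probable_groups() is a name I made up, and the sample hash stands in for whatever your own data produces.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only canonical forms shared by 2+ names.
sub probable_groups {
    my ($matches) = @_;
    my %probable;
    for my $canon (keys %$matches) {
        my @names = @{ $matches->{$canon} };
        $probable{$canon} = [@names] if @names > 1;
    }
    return \%probable;
}

# Made-up sample data in the same shape as %possible_matches.
my %possible_matches = (
    'smith jones' => [ 'Smith-Jones J.', 'Smith Jones' ],
    'doe'         => [ 'Doe' ],
);

my $probable = probable_groups(\%possible_matches);
for my $canon (sort keys %$probable) {
    print "probable match under $canon: ",
        join(', ', @{ $probable->{$canon} }), "\n";
}
```

From there, any stricter test for definite matches only has to look at the names inside each surviving group.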
Hope this serves to get you started. Happy matching.