in reply to Searching for best match
How would you proceed to find the best match?
That depends somewhat on your definition of "best", and some more explanation and examples would help, but I'm going to guess it's the longest one?
    for my $s (@search) {
        my @names = split ' ', $s;
        # this requires at least the first two names to match
        my @matches = grep { /^\Q$names[0]\E\s+\Q$names[1]\E\b/ } @source;
        @matches = sort {length($b)<=>length($a)} @matches;
        print "search='$s'\n";
        print "\tfound='$_'\n" for @matches;
    }
    __END__
    search='John Ronald Reuel T'
        found='John Ronald Reuel Tolkien'
        found='John Ronald S Tolkien'
    search='Trent Reznor'
        found='Trent Reznor'
    search='Barack Hussein II'
        found='Barack Hussein Obama II'
        found='Barack Hussein II'
    search='Barack Hussein Obama II'
        found='Barack Hussein Obama II'
        found='Barack Hussein II'
    search='No match here'
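If you only want the single "best" (here: longest) hit rather than the whole sorted list, the same idea can be wrapped in a sub. This is just a minimal standalone sketch: the @source list is sample data I inferred from the output shown above, and best_match is a hypothetical helper name, not something from the original question.

```perl
use strict;
use warnings;

# Sample data (an assumption, inferred from the output shown above).
my @source = (
    'John Ronald Reuel Tolkien',
    'John Ronald S Tolkien',
    'Trent Reznor',
    'Barack Hussein Obama II',
    'Barack Hussein II',
);

# Hypothetical helper: returns the longest candidate whose first two
# names match those of the search string, or undef if nothing matches.
sub best_match {
    my ($search, @candidates) = @_;
    my @names = split ' ', $search;
    return if @names < 2;    # need at least two names to compare
    my @matches = sort { length($b) <=> length($a) }
                  grep { /^\Q$names[0]\E\s+\Q$names[1]\E\b/ } @candidates;
    return $matches[0];      # undef when @matches is empty
}

for my $s ('John Ronald Reuel T', 'Trent Reznor', 'No match here') {
    my $best = best_match($s, @source);
    print "search='$s' best=", (defined $best ? "'$best'" : 'none'), "\n";
}
```

Whether "longest" is really the right tie-breaker still depends on what "best" means for your data, so treat the sort as the part to swap out.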
Just a note: using \w+ to match a name may not be enough, since it doesn't cover every character you might consider part of a name (for example, it doesn't match the dot, as in "Jr." or "Sr."). That's why the code above takes the alternative approach of splitting on whitespace. However, even that might not be enough, and you should probably look into the Lingua:: namespace on CPAN. For example, a quick search brings up Lingua::EN::MatchNames and Lingua::EN::NameParse.
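To make the \w+ versus whitespace-split difference concrete, here is a small sketch. The name used is only an illustrative example of mine, not taken from the original data.

```perl
use strict;
use warnings;

# Illustrative input (an assumption, not from the original question).
my $name = 'Martin Luther King Jr.';

# \w+ stops at the dot, so the suffix loses its trailing period.
my @by_word  = $name =~ /(\w+)/g;
# Splitting on whitespace keeps each token intact, dot included.
my @by_space = split ' ', $name;

print "\\w+  : $by_word[-1]\n";
print "split: $by_space[-1]\n";
```

The same thing happens with hyphens and apostrophes, which \w doesn't match either, so \w+ would cut names containing them into pieces.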