PerlMonks  

Proper case for names

by nop (Hermit)
on Sep 01, 2000 at 14:10 UTC

nop has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I am looking for a snippet or module to proper-case names, and I didn't see anything on CPAN. It seems simple, but there are special cases. Does anyone have code they can share?
Examples:
Fred Smith-Barney III is right; Fred Smith-barney Iii isn't
Bobby McPhillips is right; Bobby Mcphillips isn't
Lisa Top, PhD is right; Lisa Top, Phd isn't
etc.
Thanks

Replies are listed 'Best First'.
Re: Proper case for names
by t0mas (Priest) on Sep 01, 2000 at 14:30 UTC
Re: Proper case for names
by KM (Priest) on Sep 01, 2000 at 18:30 UTC
    Look at Lingua::EN::NameCase and Lingua::EN::NameParse.

    Cheers,
    KM

Buzzcutbuddha (Too much variation in names) - RE: Proper case for names
by buzzcutbuddha (Chaplain) on Sep 01, 2000 at 16:31 UTC
    Merlyn said it in the node "RE: Uppercase First Letter w/exceptions", and I'll repeat it: you have names like O'Reilly to test for, and sometimes Mcphillips is correct, depending on the preference of the user. For this reason I don't think a module has been written yet. You can always make one.
(bbq) Re: Proper case for names
by BBQ (Curate) on Sep 01, 2000 at 18:13 UTC
    If you wanted to get closer to your objective, you should shoot for "Proper case for names in the English language". IMHO, this is not a problem to be solved by Perl, or by programming at all. Wouldn't this be better tackled through your data-entry methods?

    #!/home/bbq/bin/perl
    # Trust no1!
      Your point is well taken. Yes, entering clean data is easier than cleaning it later.
      However, I'm dealing with a large established database, with over ten million names. Errors do creep in over time...
        If you want a good method, you could try using a large database of names for comparison (the phone book comes to mind) to figure out the likely correct capitalization of a name.

        Many companies avoid this problem altogether. Most of the bills (and other postal mail) that I receive list my name only in all caps.
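A minimal sketch of the comparison idea above, assuming a trusted reference list is available. The names `build_lookup` and `recase`, and the sample list itself, are made up for illustration; a real list would be loaded from whatever directory data you trust.

```perl
use strict;
use warnings;

# Build a lookup table: lowercase spelling => trusted spelling.
sub build_lookup {
    my (@reference) = @_;
    my %known = map { ( lc($_) => $_ ) } @reference;
    return \%known;
}

# Return the trusted spelling when the name is in the table;
# otherwise return the input unchanged rather than guess.
sub recase {
    my ($lookup, $name) = @_;
    my $hit = $lookup->{ lc $name };
    return defined $hit ? $hit : $name;
}

# Hypothetical reference data, standing in for something like a phone book.
my $lookup = build_lookup('McPhillips', "O'Reilly", 'Smith');

for my $name ('MCPHILLIPS', "o'reilly", 'Zaphod') {
    printf "%-12s => %s\n", $name, recase($lookup, $name);
}
```

Names missing from the reference list are passed through untouched, so the sketch never makes a name worse than it already was. It also cannot settle cases where both "McPhillips" and "Mcphillips" are legitimate spellings.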

Re: Proper case for names
by gnat (Beadle) on Sep 02, 2000 at 04:35 UTC
    If you're sanitizing a database, and you know that the vast majority of words are capitalized correctly, then the problem is easy to solve. Go through the database, and for each name generate the lowercase version. Keep track of how often each differently-cased form occurs for the one common lowercase form ("mckenzie" vs "McKenzie" vs "Mckenzie"). The forms that rarely occur are the mistakes; the ones that occur often are correct.

    You hope. :-)

    Nat
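The counting approach described above could be sketched roughly as follows. The sub names `tally_forms` and `majority_forms` and the sample surnames are assumptions for illustration, and the alphabetical tie-break is an arbitrary choice that only makes the result deterministic.

```perl
use strict;
use warnings;

# Count how often each cased form appears, grouped by its lowercase form.
sub tally_forms {
    my (@names) = @_;
    my %count;    # $count{lowercase form}{cased form} = occurrences
    $count{ lc $_ }{$_}++ for @names;
    return \%count;
}

# For each lowercase form, pick the most frequent cased form.
# Ties are broken alphabetically (an arbitrary, hypothetical rule).
sub majority_forms {
    my ($count) = @_;
    my %best;
    for my $lower ( keys %$count ) {
        my $forms = $count->{$lower};
        ( $best{$lower} ) =
            sort { $forms->{$b} <=> $forms->{$a} or $a cmp $b } keys %$forms;
    }
    return \%best;
}

# Made-up sample data standing in for one column of the database.
my @surnames = qw(McKenzie McKenzie Mckenzie McKenzie mckenzie Smith Smith SMITH);
my $best     = majority_forms( tally_forms(@surnames) );

# Report only the rows whose casing differs from the majority form.
for my $name (@surnames) {
    my $fixed = $best->{ lc $name };
    print "$name -> $fixed\n" if $fixed ne $name;
}
```

On real data you would probably want a minimum margin before trusting the majority, since a name that appears only twice, once in each casing, gives the vote nothing to work with.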

Re: Proper case for names
by fundflow (Chaplain) on Sep 01, 2000 at 18:02 UTC
    s/(\w+)/\u$1/g seems to work for most cases. It only uppercases the first letter of each word, so if the input case is already correct, it won't change it (e.g. McArthur remains McArthur).
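To see what the substitution above does and does not fix, here is a small sketch that wraps it in a sub (the name `ucfirst_words` is made up for illustration) and runs it over a few sample names.

```perl
use strict;
use warnings;

# Apply the substitution from the post: uppercase the first character
# of every run of word characters and leave everything else alone.
sub ucfirst_words {
    my ($name) = @_;
    ( my $out = $name ) =~ s/(\w+)/\u$1/g;
    return $out;
}

# Sample inputs chosen for illustration.
for my $name ( 'fred smith-barney', 'McArthur', 'mcphillips',
               'Fred Smith-barney Iii', 'FRED SMITH' ) {
    printf "%-24s => %s\n", $name, ucfirst_words($name);
}
```

Because the hyphen is not a word character, "smith-barney" becomes "Smith-Barney". The substitution never lowercases anything and never touches interior letters, so "mcphillips" only becomes "Mcphillips", while "Iii" and "FRED SMITH" come back unchanged.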
