PerlMonks  

Changing from full name to last, first mid

by Anonymous Monk
on May 24, 2004 at 20:28 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have searched the archive for this wisdom without success. I thought I had a simple problem, but I can't seem to find help anywhere. I have a file of full names
example John Hand Brown Cindy Jones Thomas More
and I want to change this to a list by last name, but I cannot find out how to do it anywhere
example Brown, John Hand Jones, Cindy More, Thomas
Can you help? gde

Edited by Chady -- added code tags.

Replies are listed 'Best First'.
Re: Changing from full name to last, first mid
by Old_Gray_Bear (Bishop) on May 24, 2004 at 21:29 UTC
    I have been down this long, painful road. First, look at Lingua::EN::NameParse, it can help. Second, consider the following names:
    • J R Jones
    • JR Jones (Yes, no vowel in the first name. I have an uncle who is WD Bascombe, III; it's that way on his birth-certificate, and no, his father was not WD, Jr....)
    • J. R. R. Tolkien
    • Inez dela Vega y Montoya
    • Louis de la Salle
    • Tiger (Single name, not first, not last, just 'name')
    • J. R. Jones, III
    sigh

    ----
    I Go Back to Sleep, Now.

    OGB

      Luckily, my quest was not this complicated... The few Jr. and III cases could be easily handled, and it was a one-shot deal. Thank you for the glimpse into the abyss...
Re: Changing from full name to last, first mid
by fletcher_the_dog (Friar) on May 24, 2004 at 21:14 UTC
    From the command line you could just do:
    perl -p -i -e 's/^(.*?)\s*(\w+)\s*$/$2, $1\n/' list.txt
    Update: This assumes that the names are on different lines. If they are on the same line, then you are going to have a very hard time determining where one name ends and another begins.
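
    For readers who want to see what that substitution does outside of a one-liner, here is a minimal sketch that applies the same pattern to one string at a time. The helper name last_first is made up for illustration and is not part of the original reply.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: applies the one-liner's
# substitution to a single string and returns the result.
sub last_first {
    my ($line) = @_;
    # Group 1 (non-greedy) takes everything before the final word;
    # group 2 takes the final run of word characters as the last name;
    # the trailing \s*$ swallows the newline, if there is one.
    $line =~ s/^(.*?)\s*(\w+)\s*$/$2, $1/;
    return $line;
}

print last_first($_), "\n"
    for "John Hand Brown\n", "Cindy Jones\n", "Thomas More\n";
```

    Because \w does not match a hyphen, "Catherine Zeta-Jones" comes out of this pattern as "Jones, Catherine Zeta-", so it only suits simple, one-name-per-line data.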
      The s/ code did it... I guess I need to look more at regular expressions... thank you gde
        With your help I created my first "useful" perl program... I'm so happy. I have found inner peace. gde
Re: Changing from full name to last, first mid
by hardburn (Abbot) on May 24, 2004 at 20:46 UTC

    Just look at the data you have here:

    John Hand Brown Cindy Jones Thomas More

    Knowing nothing else about your data, I look at this and decide that there are three separate names here, which are "John Hand Brown", "Cindy Jones", and "Thomas More". But it's quite possible that the names are actually "John Hand", "Brown Cindy", and "Jones Thomas More", or perhaps some other combination. Consider that the human brain is much, much better at resolving ambiguity than computers are (or at least at arriving at a solution that is closer to reality).

    If the data above is representative of what you have, then I don't think you're going to find a solution with even an acceptable failure rate.
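
    To make the ambiguity concrete, here is a rough sketch that lists every way to cut the seven words into consecutive names. It assumes, purely for illustration, that each name has two or three words; the function name segmentations is hypothetical.

```perl
use strict;
use warnings;

# Return every way to cut a word list into consecutive names of 2 or 3
# words (an assumption made for this sketch). Each result is an array
# reference holding the name strings.
sub segmentations {
    my @words = @_;
    return ([]) if !@words;    # exactly one way to split nothing
    my @results;
    for my $len (2, 3) {
        next if $len > @words;
        my $name = join ' ', @words[0 .. $len - 1];
        for my $rest (segmentations(@words[$len .. $#words])) {
            push @results, [$name, @$rest];
        }
    }
    return @results;
}

my @ways = segmentations(qw(John Hand Brown Cindy Jones Thomas More));
print join(' | ', @$_), "\n" for @ways;
```

    Even under that narrow assumption there are three readings of the line, and nothing in the data says which one is right.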

    ----
    send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.

Re: Changing from full name to last, first mid
by CountZero (Bishop) on May 24, 2004 at 20:33 UTC
    It would be helpful if you could show us the format of the file with the names. If it is just a long list of names, one after another, without any delimiters, I'm afraid there is no solution for your problem.

    CountZero

    "If you have four groups working on a compiler, you'll get a 4-pass compiler." - Conway's Law

Re: changing full name to last, first mid
by mifflin (Curate) on May 24, 2004 at 20:51 UTC
    Try this code...
    while (<DATA>) {
        @name = split /\s/;
        print pop(@name), ', ', join(' ', @name), "\n";
    }
    __DATA__
    John Hand Brown
    Cindy Jones
    Thomas More
    produces the output...

    Brown, John Hand
    Jones, Cindy
    More, Thomas
    

      Then you have someone like me who breaks your code. "Brendan Van Horn" should end up as "Van Horn, Brendan." Nothing against you, but I really get tired of people who think my middle name is Van.

      Owl looked at him, and wondered whether to push him off the tree; but, feeling that he could always do it afterwards, he tried once more to find out what they were talking about.

        Other gotchas:

        Prefixes for last name: De, Del, Dela, Di, Du, El, La, Le, Mac, Mc, San, St., Van, Vanden, Vander, Ver, Von, etc.
        Suffixes for last name: II, III, IV, Sr, Jr, MD, PhD, etc.
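
        A rough sketch of how those two lists could be used, assuming one name per line. The names norm and surname_first are made up for this example, and the lists are only the ones given above, so they are incomplete.

```perl
use strict;
use warnings;

# Lower-case a word and drop trailing periods or commas, so that
# "Jr." and "Jr" (or "St." and "St") compare equal.
sub norm { my ($w) = @_; $w = lc $w; $w =~ s/[.,]+$//; return $w }

# Lists copied from the post above; both are incomplete by nature.
my %PREFIX = map { ( norm($_) => 1 ) }
    qw(De Del Dela Di Du El La Le Mac Mc San St. Van Vanden Vander Ver Von);
my %SUFFIX = map { ( norm($_) => 1 ) } qw(II III IV Sr Jr MD PhD);

# Hypothetical helper: returns "Prefix Last, First Middle Suffix".
sub surname_first {
    my ($full) = @_;
    my @words = split ' ', $full;
    return $full if @words < 2;          # single names pass through

    # Peel off a trailing suffix such as "Jr." or "III".
    my $suffix = '';
    if (@words > 2 && $SUFFIX{ norm($words[-1]) }) {
        $suffix = pop @words;
        $words[-1] =~ s/,$//;            # "Jones, III" leaves "Jones"
    }

    # The final word always belongs to the surname; pull in any listed
    # prefixes directly before it, but keep at least one given name.
    my @last = (pop @words);
    while (@words > 1 && $PREFIX{ norm($words[-1]) }) {
        unshift @last, pop @words;
    }

    my $out = join(' ', @last) . ', ' . join(' ', @words);
    $out .= " $suffix" if length $suffix;
    return $out;
}

print surname_first($_), "\n"
    for 'Brendan Van Horn', 'Louis de la Salle', 'J. R. Jones, III';
```

        This handles "Brendan Van Horn", but "Inez dela Vega y Montoya" still comes out as "Montoya, Inez dela Vega y", because "y" is not in the prefix list. A list-based approach is only as good as its lists.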

Re: Changing from full name to last, first mid
by davido (Cardinal) on May 25, 2004 at 05:25 UTC
    You have a challenge ahead of you, honestly.

    Consider the following names:

    Mike Brown => Easy.
    John Paul Williams => Easy too.
    Biff Mc Fly => This is harder.
    Peter David Van Den Berghe => Now what do you have in mind?
    Catherine Zeta-Jones => Got a rule for this one?

    The point is, what constitutes a last name? In one of the examples above, Mc Fly is the last name. In another example, Van Den Berghe is the last name. ...You would never say, "Hello Mr. Berghe" ... It's Mr. Van Den Berghe, always. Yet how are you going to come up with a hard-and-fast set of rules that takes into account all possible forms of last names?

    It's hard. You would almost need a known last-name lookup table to match against.
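
    A minimal sketch of that lookup-table idea, assuming one name per line. The table contents and the name surname_first are invented for illustration; a real table would have to come from a file or database of known surnames.

```perl
use strict;
use warnings;

# Hypothetical lookup table of known surnames, keyed in lower case.
my %KNOWN_SURNAME = map { ( lc($_) => 1 ) }
    ('Brown', 'Williams', 'Mc Fly', 'Van Den Berghe', 'Zeta-Jones');

# Hypothetical helper: take the longest run of trailing words found in
# the table as the surname; fall back to the final word otherwise.
sub surname_first {
    my ($full) = @_;
    my @words = split ' ', $full;
    return $full if @words < 2;          # single names pass through

    my $start = $#words;                 # default: final word only
    for my $i (1 .. $#words) {           # longest candidate first
        my $candidate = lc(join(' ', @words[$i .. $#words]));
        if ($KNOWN_SURNAME{$candidate}) {
            $start = $i;
            last;
        }
    }
    return join(' ', @words[$start .. $#words]) . ', '
         . join(' ', @words[0 .. $start - 1]);
}

print surname_first($_), "\n"
    for 'Mike Brown', 'John Paul Williams', 'Biff Mc Fly',
        'Peter David Van Den Berghe', 'Catherine Zeta-Jones';
```

    Trying the longest candidate first is what keeps "Van Den Berghe" together. Any surname missing from the table silently falls back to the last word, so the result still needs a human check.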


    Dave
