pbassnote has asked for the wisdom of the Perl Monks concerning the following question:

I am working with a CSV file containing several fields of data, but I only need help re-formatting a name field. The name field holds variable-length, slash-separated (/) data that can be in any of these formats:

George Washington/GOV/VA/US
John Adams/GOV/MA/US
John Q Adams/GOV/MA/US
John Q Adams Jr/GOV/MA/US
John Q Adams JR/NonEmployee/GOV/MA/US
Thomas Jefferson/GOV/VA/US

As you can see, most of these lines have a name followed by three slash-separated bits of data, but some will have more. All I need from this is the name part of the data. Everything from the first "/" onward, including the slash itself, isn't needed.

I need to re-format the name into two fields, last name and first name, to put into a CSV file, so that the above data comes out as:

Washington, George
Adams, John
Adams, John Q
Adams Jr, John Q
etc.

I've considered using split() to separate the name field on the slashes, but I don't know how to do that. Also, how do I reverse the names to display the last name before the first name?

Thanks!

Replies are listed 'Best First'.
Re: How to re-format a name field
by toolic (Bishop) on Sep 30, 2014 at 18:50 UTC
    You could use a regex instead of split, and you could try Lingua::EN::NameParse:
    use warnings;
    use strict;
    use Lingua::EN::NameParse qw();

    my $p = Lingua::EN::NameParse->new();

    while (<DATA>) {
        # Capture everything before the first slash.
        my ($name) = $_ =~ m{([^/]+)/};

        # parse() returns a true value when it fails.
        if ($p->parse($name)) {
            print "Error: $name\n";
        }
        else {
            my %name_comps = $p->components();
            printf "%s %s, %s %s\n",
                @name_comps{qw(surname_1 suffix given_name_1 initials_1)};
        }
    }

    __DATA__
    George Washington/GOV/VA/US
    John Adams/GOV/MA/US
    John Q Adams/GOV/MA/US
    John Q Adams Jr/GOV/MA/US
    John Q Adams JR/NonEmployee/GOV/MA/US
    Thomas Jefferson/GOV/VA/US
    Martin Van Buren/GOV/VA/US

    Outputs:

    Washington , George
    Adams , John
    Adams , John Q
    Adams Jr, John Q
    Adams JR, John Q
    Jefferson , Thomas
    Van Buren , Martin
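
For comparison, the split() approach asked about in the question could look roughly like the sketch below, using only core Perl. The helper name reformat_name and the list of suffixes are assumptions made for illustration, and the code assumes the surname is a single word, so it will get names such as "Martin Van Buren" wrong.

```perl
use strict;
use warnings;

# Hypothetical helper: naive "Last, First" re-formatting.
# Assumes the surname is one word, optionally followed by a suffix.
sub reformat_name {
    my ($field) = @_;
    return '' unless defined $field && length $field;

    # Keep only the text before the first slash.
    my ($name) = split m{/}, $field, 2;

    # Split the name on whitespace.
    my @words = split ' ', $name;
    return $name if @words < 2;    # nothing to reorder

    # Treat a trailing Jr/Sr/II/III/IV as a suffix that stays with the surname.
    # This suffix list is an assumption, not an exhaustive set.
    my $suffix = '';
    if (@words > 2 && $words[-1] =~ /^(?:jr|sr|ii|iii|iv)\.?$/i) {
        $suffix = ' ' . pop @words;
    }

    my $last = pop @words;
    return "$last$suffix, @words";
}

print reformat_name($_), "\n" for (
    'George Washington/GOV/VA/US',
    'John Q Adams Jr/GOV/MA/US',
    'John Q Adams JR/NonEmployee/GOV/MA/US',
);
```

The third argument to split (2) stops splitting after the first slash, so the extra "NonEmployee" segment does not matter.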
Re: How to re-format a name field
by davido (Cardinal) on Sep 30, 2014 at 18:30 UTC

    Sounds easy until you encounter "John Van Komen" or "Jo Ellen de Lourdes Van Den Berghe" and suddenly split someone's last name in half while at the same time losing part of their first name. Defining the rules correctly will be hard, for reasons unrelated to programming.
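
One partial workaround is to treat common surname particles as part of the last name. The sketch below is only a heuristic: the particle list and the helper name split_surname are assumptions for illustration, the list is deliberately incomplete, and whether the resulting split is correct still depends on the person.

```perl
use strict;
use warnings;

# Hypothetical, deliberately incomplete list of surname particles.
my %particle = map { $_ => 1 } qw(van von de den der la le di da);

# Returns (given names, surname) for a whitespace-separated name.
sub split_surname {
    my ($name) = @_;
    my @words = split ' ', $name;
    return ('', $name) if @words < 2;

    my @last = (pop @words);

    # Pull preceding particles into the surname,
    # but always leave at least one given name.
    while (@words > 1 && $particle{ lc $words[-1] }) {
        unshift @last, pop @words;
    }
    return ("@words", "@last");
}

for my $name ('John Van Komen', 'Jo Ellen de Lourdes Van Den Berghe') {
    my ($given, $surname) = split_surname($name);
    print "$surname, $given\n";
}
```

A name whose surname happens to follow a given name that looks like a particle would still be split wrongly, which is the point made above: the rules are the hard part, not the code.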


    Dave

Re: How to re-format a name field
by MidLifeXis (Monsignor) on Sep 30, 2014 at 19:59 UTC

    Various regions of the world treat surnames and given names differently. Add also the case of multiple middle names, multiple last names, hyphenated or not, titles, suffixes, and other human issues associated with names, and you are in for quite a ride.

    --MidLifeXis