perleager has asked for the wisdom of the Perl Monks concerning the following question:

Hey,

I have three fields in a MySQL db, Name, Date, and Hour.

Right now there is no last name field and now since im into more of this perl web data application, I want to start sorting by last name.

The db is currently has 5000+ data entries. What would be the best most efficient way to create a last name field with all the values according to the Name field.

such as:

Name Date Hour <br> Bob Jones 11/02/83 5<br> John Doe 11/03/83 4<br>


I was thinking about, putting all those names into an array, then use a foreach statement to split the names up so I could retrieve the last name. Once splitten up, put those values into a array, then use mysql commands to add a new column and insert the values accordingly.

Finished result (ex.):

Name Last Date Hour <br> Bob Jones Jones 11/02/83 5<br> John Doe Doe 11/03/83 4<br><br>
Anthony

Replies are listed 'Best First'.
Re: MySQL last name field
by VSarkiss (Monsignor) on Apr 21, 2003 at 15:56 UTC

    ...use a foreach statement to split the names up so I could retrieve the last name
    Don't underestimate the difficulty of this task. A simple-minded split on whitespace will probably get you no better than 75%. You may want to get help from a module like Lingua::EN::ParseNames.

Re: MySQL last name field
by Improv (Pilgrim) on Apr 21, 2003 at 15:42 UTC
    Your solution is basically sound, but there are some things worth noting:
    1. You really don't want to have a Name field if you have a last name field -- what if someone changes their name? It's safer to eventually drop the name column entirely, when you've split it to first and last names.
    2. Sometimes people's names arn't quite as easily parsed as that. What if someone has a first name of "Mary Anne" or a last name of "Simon Broquard"? There's not really a good way to deal with those situations except flag them and fix them by hand with domain-specific knowledge (e.g. ask them)
    3. You might be able to do this more efficiently by writing a stored procedure (not sure if MySQL supports them though), so that all that data won't need to flow woth ways from/to the database. OTOH, ~5000 data entries is perhaps not enough data for this to be a major concern
    So, in sum, don't split name to name and lastname, split it to firstname and lastname, and remove name. Be careful that your splits don't encounter oddities. I hope this helps.
Re: MySQL last name field
by dws (Chancellor) on Apr 21, 2003 at 17:24 UTC
    I was thinking about, putting all those names into an array, then use a foreach statement to split the names up so I could retrieve the last name.

    Depending on your data, that could work fine. Edge cases like "Jay Random, Jr." will throw that off, but if you only have 5000+ names, you can inspect and fix those up manually.

    A better scheme is to not assume that you can do manipulations like that on names, and to keep the "display" name (Bob Jones) separate from the "sort" name (Jones, Bob). All names aren't English, and the sorting rules for names are often non-obvious.

      Hey,

      I am aware of the name issue. If it was a english name, we entered it as Marry-Anne Keating. If it had "jr." in it it'll go as, Stephen Jones-Jr.

      Anthony