http://qs1969.pair.com?node_id=213144

perleager has asked for the wisdom of the Perl Monks concerning the following question:

Hello:

I have data that I want to sort by alphabetical order. I want two lists, A-K list and a L-Z list.

The variable I will be sorting is the Last name.

I have a data looks like this:

John Smith 443-333
Alan Jones 242-344
Henry Love 245-333
etc...

My data is stored in MySQL and is in the column "Name". How would I sort this by last name? Should I split the variable from the array, then add the last name string to a new array, then sort that array?

Thanks,
Anthony

Replies are listed 'Best First'.
Re: MySQL Aphlabetical order with DBI
by Asim (Hermit) on Nov 15, 2002 at 13:47 UTC

    Rather than muck with splitting the name, I suggest you split the data in the database into First and Last Name columns, and use ORDER BY to retrieve results as you need them. This sort of sorting is why you use a database to begin with, as it is more efficent than a program of any sort.

    If you really need to do a split, the Lingua::EN::NameParse module is one of a number designed to do the hard work of parsing people's names correctly.

    ----Asim, known to some as Woodrow.

Re: MySQL Alphabetical order with DBI
by UnderMine (Friar) on Nov 15, 2002 at 14:05 UTC
    Do it in the database....
    SELECT Name, RIGHT(Name,LENGTH(Name)-LOCATE(' ',Name)) as sortorder FROM Addresses ORDER BY RIGHT(Name,LENGTH(Name)-LOCATE(' ',Name))
    Better still store your data in a more usable format ;) Thanks UnderMine
Re: MySQL Alphabetical order with DBI
by earthboundmisfit (Chaplain) on Nov 15, 2002 at 13:56 UTC
    You could do that, splitting on the space in the name column and then running a sort in Perl, but it's going to lead to all sorts of problems if/when variant names crop up. What happens if you have a name like John Jacob Ezekiel Smith.....?

    Personally, I would restructure the database and split the column into consituents. FirstName (varchar 50), LastName (varchar 50), pageref (??). That way, you can use SQL to perform the sort (e.g. SELECT * FROM authors ORDER BY LastName) and eliminate the possibility of sorting on the wrong name. You would also see a speed performance increase doing it this way.

    update: a little slow on the keyboard this morning. Asim makes a good point about Lingua::EN::NameParse

Re: MySQL Alphabetical order with DBI
by schumi (Hermit) on Nov 15, 2002 at 15:16 UTC
    I like UnderMine's idea, but this only works assuming that there's no double last name with a space between them in the database. If, for example, you have a weird name like "John de Clerk", then the "de"-bit would be regarded as a part of the first name.

    Even worse, when someone has a name like "Sandra Meier Schenk". With Swiss people, this could very well occur, with "Sandra" being the first name, "Meier" being the woman's maiden, and "Schenk" the last name of her husband. According to Swiss law, this is perfectly possible and these days even quite frequent. (I won't go into the possible weird situations to which that might lead, let your own imagination run wild.... :-)

    Now, I don't know what you intend to do with the data once you've extracted it. If it's just about sorting the stuff, then this might do.

    If, however, you want to do something more with the data, even if it's at a later stage, then the only way to go is to re-structure your database. This helps you with your immediate problem, and also puts your data in a form which, should a similar need turn up again, makes the problem much easier. To achieve this, apart from doing it by hand (which is hardly ever an option, unless you only have five entries), the afore-mentioned Lingua::EN::NameParse is probably best.

    --cs

    There are nights when the wolves are silent and only the moon howls. - George Carlin

      (This may be a bit tangential to the specific question, but I don't think it's entirely off-topic...)

      I can attest to having encountered the multiple-last-name problem firsthand. Because my parents were uncreative (just kidding, mom & dad :-), they decided to give me the same first and middle names as my father, which were the same as his father's, and his father's... so my legal name actually has an "IV" (Roman numeral 4) suffix on it. Every now and then I'll do something like withdraw money from an ATM and have it display something like "Thank you, Mr. IV".

      Chinese names present an altogether different problem, because in Chinese the last name comes first. This means that Mao Zedong, for example, is (well, was) Mr. Mao, not Mr. Zedong or Mr. Dong.

      What's my point here? That name parsing is another of those tasks in which the naive algorithm (split on spaces) can miss a lot of subtle cases. If the data set is not trivial and you want to get it right -- which is a laudable goal, as many people get annoyed when their names are mangled -- using something like Lingua::EN::NameParse is probably smart. (It won't handle Chinese names as far I can tell, but then it does live under Lingua::EN...)

              $perlmonks{seattlejohn} = 'John Clyman';

      I think it should work in all the above cases as it is the equivalent of s/^.*?\s(.*)$/$1/ but it will fail on multiple firstnames or titles.

      Dr. John Smith
      A N Other

      You normally want to store :- Title, Firstname, Surname, Other Names/Initials and any other data. I have even used a seperator in the name like : if I can not change the database structure for some reason.

      Hope this helps

      UnderMine