Perl Newby has asked for the wisdom of the Perl Monks concerning the following question:

I have a text file that contains data that is pipe delimited. Below is an example:
4874|Kent Bottenfield|4|2|3.94|16.0|15|7|7|3|13|1|
5868|Nelson Cruz|7|0|4.00|9.0|10|5|4|1|5|0|0|0|0|0|2|
6111|Octavio Dotel|2|2|4.91|11.0|12|7|6|7|9|1|1|0|0|
6030|Scott Elarton|4|4|5.09|23.0|21|13|13|6|21|3|1|0|
6534|Wayne Franklin|7|0|6.43|7.0|11|5|5|3|4|0|0|1|0|0|
Right now I am able to parse all of the data out of the text file and output it as HTML. I want to take the second column and display it in abbreviated form, e.g. K. Bottenfield, instead of the full name; right now I can only get the full name. That is, I want to take just the first letter of the first name, put a period after it, then append the last name. Any suggestions?

Replies are listed 'Best First'.
Re: Splitting Data Out of a Text File
by arturo (Vicar) on May 08, 2001 at 02:42 UTC

    It's hard to give a really robust solution here without knowing what forms the data can take. First, do you always have Firstname Lastname, or do some players have middle names? And if bell hooks or e.e. cummings starts to play baseball, you'll have to deal with lowercase names. But here's how you might do it. I assume you're splitting each line into an array, and that the second element is the player's name.

    # split line into @data, or whatever
    $data[1] =~ s/^([A-Z])\w*( \w+)$/$1.$2/g;

    What that does is grab the first capital letter of $data[1] and remember it in $1; then it matches any further word characters (which get thrown away), followed by a space and the last run of word characters in the string (which it remembers, along with the space, as $2).

    It then replaces the whole match with the first thing it remembered, followed by a period, then the second thing it remembered.
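
    To see the substitution in context, here is a minimal sketch that splits two of the sample records from the question and abbreviates the second field. The helper name abbreviate_name is made up for illustration, and the sketch assumes every record has the name in the second pipe-delimited field.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "Firstname Lastname" into "F. Lastname".
# Names that don't fit that exact shape are returned unchanged.
sub abbreviate_name {
    my ($name) = @_;
    $name =~ s/^([A-Z])\w*( \w+)$/$1.$2/;
    return $name;
}

# Two sample records copied from the question.
my @lines = (
    '4874|Kent Bottenfield|4|2|3.94|16.0|15|7|7|3|13|1|',
    '6111|Octavio Dotel|2|2|4.91|11.0|12|7|6|7|9|1|1|0|0|',
);

my @short;
for my $line (@lines) {
    # The pipe is a regex metacharacter, so it has to be escaped.
    my @data = split /\|/, $line;
    push @short, abbreviate_name($data[1]);
}

print "$_\n" for @short;
```

    Because the pattern is anchored at both ends, a name such as J.T. Snow or bell hooks simply fails to match and is left as the full name rather than being mangled.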

    See perldoc perlre for more ways to have fun with regular expressions.

    Note: this is only one way to do it =)

    HTH

Re: Splitting Data Out of a Text File
by Hot Pastrami (Monk) on May 08, 2001 at 02:38 UTC
    Maybe this:
    # only rewrite the name if the pattern actually matched
    $fullname = "$1. $2" if $fullname =~ /^(.)\S*\s(.+)/;


    Hot Pastrami
(OT) Re: Splitting Data Out of a Text File
by ChemBoy (Priest) on May 08, 2001 at 03:02 UTC

    I believe the technical aspects of the question are adequately addressed above--I'd go for the capital letters character class, personally.

    That said, you're going to have trouble with this, one way or another--bell hooks doesn't play in the National League, but J.T. Snow does, as do Bobby Jones and Bobby Jones, not to mention two each of C. Jones and T. Jones (okay, one of those plays in the AL...). Martínez will give you similar issues, and Rodríguez is a mild pain as well (as is Smith, while we're at it).

    Sure you can't deal with the long names?

    Of course, I don't think any of those pairs play for the same team these days, so you're pretty safe if you stick to box scores. But I'd still be nervous doing it.
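
    One way to hedge against those collisions is to abbreviate only when the short form is unique within the list being printed, and fall back to the full name otherwise. This is just a sketch: abbreviate_name and display_names are hypothetical helpers, and the Jones names are made up for illustration. It can't help with two players who share the same full name.

```perl
use strict;
use warnings;

# Hypothetical helper: "Firstname Lastname" becomes "F. Lastname";
# anything else is returned unchanged.
sub abbreviate_name {
    my ($name) = @_;
    $name =~ s/^([A-Z])\w*( \w+)$/$1.$2/;
    return $name;
}

# Hypothetical helper: return display names in the same order as the
# input, using the short form only where it is unique in this list.
sub display_names {
    my @full  = @_;
    my @short = map { abbreviate_name($_) } @full;

    # Count how many entries share each short form.
    my %count;
    $count{$_}++ for @short;

    my @shown;
    for my $i (0 .. $#full) {
        push @shown, $count{ $short[$i] } > 1 ? $full[$i] : $short[$i];
    }
    return @shown;
}

# Made-up roster: both Joneses would abbreviate to "C. Jones".
my @shown = display_names('Kent Bottenfield', 'Carl Jones', 'Craig Jones');
print join(', ', @shown), "\n";
```

    Checking per list (say, per team or per box score) rather than across the whole league keeps the abbreviation for as many players as possible.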



    If God had meant us to fly, he would *never* have given us the railroads.
        --Michael Flanders

Re: Splitting Data Out of a Text File
by lhoward (Vicar) on May 08, 2001 at 16:25 UTC
    As mentioned above, parsing names is hard. You may want to look at the Lingua::EN::NameParse module for assistance.