in reply to Dealing with Names

Hm, are you looking for something like this?

my @names = (
    'Daniel R Von Vanderschmidt',
    'Daniel Von Vanderschmidt',
    'Daniel De La Silvia',
    'Daniel De Silvia',
    'Daniel La Silvia',
);

for my $name (@names) {
    # A surname particle swallows the rest of the string;
    # anything else is captured one word at a time.
    my @comps = $name =~ m{(?:Von|De La|La).*|\w+}g;
    print "[$_]" for @comps;
    print "\n";
}

The output is:

[Daniel][R][Von Vanderschmidt]
[Daniel][Von Vanderschmidt]
[Daniel][De La Silvia]
[Daniel][De][Silvia]
[Daniel][La Silvia]

Update: Added the .* after re-reading your example output.
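
One caveat with the pattern above: the alternatives are not anchored to word boundaries, so La would also match the start of a given name such as Larry, and a lowercase von is not recognized. A minimal sketch of a stricter variant follows. The helper name split_name is made up for illustration, and the particle list is the same one used above; adjust it to your data.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): split a full name into
# components, keeping a surname particle together with what follows it.
sub split_name {
    my ($name) = @_;
    # \b before and \s after the particle stop "La" from matching inside
    # "Larry"; the /i flag also accepts lowercase particles such as "von".
    return $name =~ m{\b(?:Von|De\s+La|La)\s.*|\w+}gi;
}

for my $name ('Daniel Larry Silvia', 'daniel von Derfen', 'Daniel De La Silvia') {
    print "[$_]" for split_name($name);
    print "\n";
}
```

This should print [Daniel][Larry][Silvia], [daniel][von Derfen] and [Daniel][De La Silvia], one per line.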

Re^2: Dealing with Names
by walkingthecow (Friar) on Aug 28, 2008 at 18:43 UTC
    I'm not trying to match them; I am trying to make anything like De La Whatever end up as one name in a single element of the array.

    Say we have an array that looks like this:

    {Daniel}{De}{La}{Silva} (each bracket is an element)

    I am trying to make the array look like this:

    {Daniel}{De La Silva}{}{}

    However, say an array looks like this:

    {Daniel}{R}{Silva}

    then keep the array the way it is... only join those last names that have spaces (e.g. von Derfen).
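
    The transformation described above can also be done directly on the array, without going back to a string. What follows is only a sketch under stated assumptions: the helper name join_surname is invented, the particle list (von, de, la, matched case-insensitively) is a guess at what the data contains, and the first element is never treated as a particle.

```perl
use strict;
use warnings;

# Hypothetical helper: join a surname particle with everything after it,
# then pad with empty strings so the element count stays the same.
sub join_surname {
    my @parts = @_;
    my $count = @parts;

    # Start at index 1 so a first name is never taken for a particle,
    # and stop before the last element so there is something to join.
    for my $i (1 .. $#parts - 1) {
        next unless $parts[$i] =~ /^(?:von|de|la)$/i;    # assumed particle list
        # Replace the particle and everything after it with one joined element.
        splice(@parts, $i, @parts - $i, join(' ', @parts[$i .. $#parts]));
        last;
    }

    push @parts, '' while @parts < $count;
    return @parts;
}

print "{$_}" for join_surname(qw(Daniel De La Silva));
print "\n";    # {Daniel}{De La Silva}{}{}
print "{$_}" for join_surname(qw(Daniel R Silva));
print "\n";    # {Daniel}{R}{Silva}
```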

      This is fun to play with, but keep in mind that it cannot be solved perfectly. Von is not an entirely uncommon given or middle name, for example, so Mark Von Shepard might correctly be {Mark}{Von}{Shepard} instead of {Mark}{Von Shepard}. If you don't get name data delimited correctly at the point of input, you're never going to reverse engineer it with 100% reliability.

      I think betterworld has given a pretty simple and elegant solution. Given the code you posted in Re^2: Dealing with Names, you could adapt betterworld's code like so:

      my $name_count = scalar split / /, $new_gecos;
      my @comps = $new_gecos =~ m{(?:Von|De La|La).*|\w+}g;
      push @comps, '' while scalar @comps < $name_count;

      The first and third lines are only there to give you the blank array elements you apparently want. I'm putting in an empty string there, but maybe you want undef. Season to taste.
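
      For anyone who wants to try this standalone, here is a self-contained sketch of the same idea. The sample value assigned to $new_gecos is made up; in the original script it comes from elsewhere. The field count goes through a temporary array (@fields, a name introduced here) instead of calling split in scalar context, which older Perls handled by splitting into @_.

```perl
use strict;
use warnings;

# Sample value for illustration only; the real $new_gecos comes from the
# original poster's script.
my $new_gecos = 'Daniel De La Silva';

# Count the space-separated fields before regrouping them.
my @fields     = split / /, $new_gecos;
my $name_count = @fields;

# Same pattern as above: a particle takes the rest of the string.
my @comps = $new_gecos =~ m{(?:Von|De La|La).*|\w+}g;

# Pad with empty strings (swap in undef if preferred) up to the original count.
push @comps, '' while @comps < $name_count;

print "{$_}" for @comps;
print "\n";    # {Daniel}{De La Silva}{}{}
```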