in reply to Split first and last names

G'day Bod,

I agree with others that you should change the form. Ask specifically for first name and last name.

"The obvious problem is that it fails with extended characters such as Zoë."

Take a look at perlrecharclass and follow links from there.

This code is not intended as a solution to your problem; it's just to demonstrate some options that are available:

$ perl -Mstrict -Mwarnings -Mutf8 -C -E '
    my $n = "Zoë Åcçéñt-Smythe";
    my ($f, undef, $l)
        = $n =~ /([[:alpha:]]+)( +|\Z)([\p{Alpha}\p{Punct}]*)/;
    say $f;
    say $l;
'
Zoë
Åcçéñt-Smythe
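
To isolate the character-class point, here is a minimal sketch comparing an ASCII-only class with a Unicode property. The ASCII pattern is an assumption chosen for illustration (the pattern from the original question is not reproduced here), and the variable names are hypothetical:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;                                   # source contains UTF-8 literals
use open OUT => qw{:encoding(UTF-8) :std};  # encode output as UTF-8

my $name = "Zoë";

# Assumed ASCII-only pattern: stops at the first non-ASCII letter.
my ($ascii_match) = $name =~ /^([A-Za-z]+)/;

# Unicode property: matches any alphabetic character.
my ($unicode_match) = $name =~ /^(\p{Alpha}+)/;

print "ASCII class:   $ascii_match\n";    # Zo
print "Unicode class: $unicode_match\n";  # Zoë
```

The "use utf8" pragma matters here: without it, the literal is treated as bytes rather than characters.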

— Ken

Re^2: Split first and last names
by Bod (Parson) on Nov 12, 2022 at 23:29 UTC

    Thanks Ken for the helpful code sample that I will muse over

    With regards changing the form, that won't be happening as explained in Re^2: Split first and last names

      "Thanks Ken for the helpful code ..."

      You're welcome.

      "With regards changing the form, that won't be happening as explained in Re^2: Split first and last names"

      That's new information (only posted an hour or so ago) but does add some clarity.

      "... full name (... envelope) ... firstname ... salutation."

      Perhaps something along these lines:

      #!/usr/bin/env perl

      use strict;
      use warnings;
      use utf8;
      use open OUT => qw{:encoding(UTF-8) :std};

      my @names = ("Zoë", "Zoë Åcçéñt-Smythe");

      my $re = qr{(?x:
          ^
              (
                  ( [\p{Alpha}'_-]+ )
                  [\s\p{Alpha}'_-]*
              )
          $
      )};

      for my $name (@names) {
          my ($full, $first) = $name =~ $re;

          print "Name: $name\n";
          print "First: $first\n";
          print "Full: $full\n";
      }

      Output:

      Name: Zoë
      First: Zoë
      Full: Zoë
      Name: Zoë Åcçéñt-Smythe
      First: Zoë
      Full: Zoë Åcçéñt-Smythe

      Note that I allowed three punctuation characters ('_-); alter as necessary. I know, from earlier posts, that you're across SQL injection issues. Be aware that, between reading data from the web and supplying it to SQL, there may be other code injection issues. Without knowing anything more about your code, that's something you'll need to assess for yourself: I didn't include any validation, but you should.
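
      As a rough sketch of what that validation might look like: the parse_name function, the 100-character limit and the whitespace trimming are all assumptions for illustration, not requirements taken from your application. It reuses the regex above and returns an empty list for anything that doesn't match.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use utf8;
use open OUT => qw{:encoding(UTF-8) :std};

# Same pattern as above: letters plus three allowed punctuation characters.
my $re = qr{(?x: ^ ( ( [\p{Alpha}'_-]+ ) [\s\p{Alpha}'_-]* ) $ )};

# Hypothetical helper: returns (full, first) on success, empty list otherwise.
sub parse_name {
    my ($input) = @_;
    return unless defined $input;

    $input =~ s/^\s+|\s+$//g;           # trim leading/trailing whitespace
    return if length($input) == 0;
    return if length($input) > 100;     # assumed upper limit; adjust to suit

    my ($full, $first) = $input =~ $re;
    return unless defined $first;       # reject anything the regex refuses

    return ($full, $first);
}

for my $candidate ("  Zoë Åcçéñt-Smythe ", "Robert'); DROP TABLE") {
    my ($full, $first) = parse_name($candidate);
    print defined $first ? "OK: first=$first full=$full\n" : "Rejected\n";
}
```

      Only the captured values are used afterwards, never the raw input; placeholders are still needed when the values reach SQL.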

      "As for the database storage..."

      It looks like most of that would be covered by "If we need extra information ... we ask for it ...". The majority wouldn't be covered by user input anyway (e.g. nickname, preferred names). Again, something for you to determine using the same principles as above (i.e. limited regex capture, code injection & validation).

      — Ken