I was wondering if this question is related to last year's Split first and last names

No connection at all...
But bonus points for your memory recall and for joining similar dots!

Any update on your learnings from your long name parsing journey?

I did create an internal discussion document and went on to create a parser for dealing with name strings and splitting them up reasonably well. But, given that they are difficult to split programmatically with certainty, we parse the string and then show the user how we have split it, allowing them to adjust as their superior human brain sees fit. The exception is where the name consists of a known firstname (looked up from a long list) and a single surname. Then we don't show the parse results, but the user can still adjust if they think it's appropriate.
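
As a rough illustration of that flow (not the actual parser, which isn't shown here), a minimal Python sketch might look like the following. The `KNOWN_FIRST_NAMES` set, the `split_name` function and the naive "first word / last word" split are all assumptions made for the example; the real lookup list is far longer and the real splitting rules are more careful.

```python
# Hypothetical stand-in for the long list of known first names.
KNOWN_FIRST_NAMES = {"john", "mary", "johannes", "anna"}


def split_name(raw):
    """Split a raw name string into (first, middle, last, needs_review).

    needs_review is False only for the "safe" case described above:
    exactly two words, the first of which is a known first name.
    """
    parts = raw.split()
    if not parts:
        # Nothing to split: always ask the user.
        return ("", "", "", True)
    if len(parts) == 1:
        # A single word could be a first name or a surname: ask the user.
        return (parts[0], "", "", True)
    if len(parts) == 2 and parts[0].lower() in KNOWN_FIRST_NAMES:
        # Known first name plus a single surname: no need to show the split.
        return (parts[0], "", parts[1], False)
    # Anything else is a best guess that the user should confirm or adjust.
    return (parts[0], " ".join(parts[1:-1]), parts[-1], True)
```

With this sketch, `split_name("Mary Smith")` is accepted without review, while anything longer or unrecognised comes back flagged so that the split can be shown to the user for adjustment.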

It is working well for the low volumes of traffic we currently have.

Our roadmap includes adding AI to this parsing process. It is something that AI should be as good as a human at doing. At least, nearly as good as a human. So far I have written a couple of prompts and fed the AI a variety of tricky names to split up into their component parts and the results look promising. The AI is formatting them nicely as JSON so we should be able to deal with the results.
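
Dealing with the JSON could be as simple as checking it before trusting it. The sketch below is only an assumption about how that might be done: the field names in `EXPECTED_FIELDS` and the `parse_ai_reply` helper are invented for the example and are not the prompts or schema actually in use.

```python
import json

# Hypothetical field names; the real schema is not given in this post.
EXPECTED_FIELDS = ("title", "first", "middle", "last", "suffix")


def parse_ai_reply(reply_text):
    """Return a dict of name parts, or None if the reply is unusable."""
    try:
        data = json.loads(reply_text)
    except json.JSONDecodeError:
        # The model returned something that is not valid JSON.
        return None
    if not isinstance(data, dict):
        return None

    result = {}
    for field in EXPECTED_FIELDS:
        value = data.get(field, "")
        if not isinstance(value, str):
            # Unexpected type: fall back to the non-AI parser or the user.
            return None
        result[field] = value.strip()

    # Keep track of any parts the model supplied that we have no field for.
    result["unmapped"] = sorted(set(data) - set(EXPECTED_FIELDS))
    return result
```

A `None` result would mean falling back to the existing parser and the user's own judgement, and a non-empty `unmapped` list flags names with more parts than there are fields to hold them.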

It's parsed Johannes Adam Ferdinand Alois Josef Maria Marko d'Aviano Pius von und zu Liechtenstein and told me that I don't have enough fields to properly accommodate all the constituent parts. But I doubt the ruler of Liechtenstein will need us to store his name!


In reply to Splitting names revisited (was: Re^2: Capture Groups) by Bod
in thread Capture Groups by Bod
