I was wondering if this question is related to last year's "Split first and last names"?
No connection at all...
But bonus points for your memory recall and for joining similar dots!
Any update on your learnings from your long name-parsing journey?
I did create an internal discussion document and went on to create a parser that deals with name strings and splits them up reasonably well. But, given that names are difficult to split programmatically with certainty, we parse the string and then show the user how we have split it, allowing them to adjust as their superior human brain sees fit. The exception is where the name is a known first name (looked up from a long list) followed by a single surname. Then we don't show the parse results, but the user can still adjust if they think it's appropriate.
It is working well for the low volumes of traffic we currently have.
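For anyone wanting to try a similar flow, here is a minimal sketch in Python. It is not the parser described above, only an illustration of the "known first name plus single surname" rule. The names `KNOWN_FIRST_NAMES`, `ParsedName`, and `parse_name` are hypothetical, the tiny set stands in for the long lookup list, and the naive whitespace split ignores titles, suffixes, and surname particles.

```python
from dataclasses import dataclass

# Hypothetical stand-in for the long list of known first names.
KNOWN_FIRST_NAMES = {"john", "mary", "anna", "david"}


@dataclass
class ParsedName:
    first: str
    middle: str
    last: str
    needs_review: bool  # True when the split should be shown to the user


def parse_name(raw: str) -> ParsedName:
    """Naively split a name on whitespace and flag uncertain results."""
    parts = raw.split()
    if not parts:
        return ParsedName("", "", "", needs_review=True)
    if len(parts) == 1:
        # A single token could be a first name or a surname: ask the user.
        return ParsedName(parts[0], "", "", needs_review=True)
    first, *middle, last = parts
    # Only skip the review step for a known first name plus one surname.
    confident = len(parts) == 2 and first.lower() in KNOWN_FIRST_NAMES
    return ParsedName(first, " ".join(middle), last, needs_review=not confident)
```

With this sketch, `parse_name("Mary Smith")` would skip the review step, while `parse_name("Mary Anne van der Berg")` would be flagged for the user to check, since the naive split puts "van der" in the middle name.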
Our roadmap includes adding AI to this parsing process. It is something that AI should be as good as a human at doing. At least, nearly as good as a human. So far I have written a couple of prompts and fed the AI a variety of tricky names to split up into their component parts and the results look promising. The AI is formatting them nicely as JSON so we should be able to deal with the results.
It's parsed Johannes Adam Ferdinand Alois Josef Maria Marko d'Aviano Pius von und zu Liechtenstein and told me that I don't have enough fields to properly accommodate all the constituent parts. But I doubt the ruler of Liechtenstein will need us to store his name!
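Since the AI replies come back as JSON, one way to "deal with the results" is to validate them before trusting them. The sketch below is an assumption about how that might look, not what we have built: the field names in `EXPECTED_FIELDS` and the function `load_ai_split` are hypothetical, and the real prompt and schema are not shown here.

```python
import json

# Hypothetical field names; the real schema may differ.
EXPECTED_FIELDS = ("title", "first", "middle", "last", "suffix")


def load_ai_split(response_text: str):
    """Return a dict of name parts, or None if the reply is unusable."""
    try:
        data = json.loads(response_text)
    except json.JSONDecodeError:
        return None  # Reply was not valid JSON.
    if not isinstance(data, dict):
        return None  # Valid JSON, but not an object of named fields.
    result = {}
    for field in EXPECTED_FIELDS:
        value = data.get(field, "")
        if not isinstance(value, str):
            return None  # Unexpected type: fall back to manual review.
        result[field] = value.strip()
    return result
```

A `None` result would simply send the name back through the "show the user the split" path, so a malformed reply never ends up stored.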