Welcome to the administrative sub-basement of hell. Depending on the names your friend has to deal with, you might discover that a regex can handle 98%, but that the remaining 2% will cause you to run screaming into the night.
Consider my dear friend Lt. Col. J. Random von Perl-Hacker III. By the scheme your friend is using, Randy's name needs to reduce to <Lt. Col.> <J.> <R.> <von Perl-Hacker>. (And it isn't immediately clear what to do with the "III".) In any large set of unstructured names, you're going to run into a few like this. Good luck handling them with a single regexp.
I think you'll have better luck breaking the name into tokens, providing predicate functions that answer whether a token can be of a particular type, then providing a set of "rules" to match a set of tokens against. This will be slower, but potentially much more accurate, than a regex.
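To make the token/predicate/rule idea concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the lookup tables (`TITLES`, `PARTICLES`, `SUFFIXES`), the predicate names, the order of the rules, and the choice to park "III" in its own suffix field. Real data will need much bigger tables and more rules.

```python
import re

# Hypothetical lookup tables; a real data set would need far larger ones.
TITLES = {"lt.", "col.", "dr.", "mr.", "mrs.", "ms.", "prof."}
PARTICLES = {"von", "van", "de", "der", "la"}
SUFFIXES = {"jr.", "sr.", "ii", "iii", "iv"}

# Predicates: each answers "can this token be of this type?"
def is_title(tok):
    return tok.lower() in TITLES

def is_particle(tok):
    return tok.lower() in PARTICLES

def is_suffix(tok):
    return tok.lower() in SUFFIXES

def is_initial(tok):
    return re.fullmatch(r"[A-Z]\.", tok) is not None

def is_word(tok):
    return re.fullmatch(r"[A-Za-z][A-Za-z'-]*", tok) is not None

def to_initial(tok):
    """Reduce a given name to an initial; leave existing initials alone."""
    return tok if is_initial(tok) else tok[0].upper() + "."

def parse_name(name):
    """Apply the rules in order; raise ValueError if a token fits no rule."""
    tokens = name.split()
    parts = {"title": [], "given": [], "surname": [], "suffix": []}

    # Rule 1: any run of titles at the front.
    while tokens and is_title(tokens[0]):
        parts["title"].append(tokens.pop(0))

    # Rule 2: any run of suffixes at the end (assumption: keep them separate).
    while tokens and is_suffix(tokens[-1]):
        parts["suffix"].insert(0, tokens.pop())

    # Rule 3: the surname is the last word plus any particles just before it.
    if tokens and is_word(tokens[-1]):
        parts["surname"].insert(0, tokens.pop())
        while tokens and is_particle(tokens[-1]):
            parts["surname"].insert(0, tokens.pop())

    # Rule 4: whatever remains must be given names or initials.
    for tok in tokens:
        if not (is_initial(tok) or is_word(tok)):
            raise ValueError(f"no rule matches token {tok!r} in {name!r}")
        parts["given"].append(to_initial(tok))

    return {kind: " ".join(toks) for kind, toks in parts.items()}

print(parse_name("Lt. Col. J. Random von Perl-Hacker III"))
```

With these particular tables, Randy comes out as title "Lt. Col.", given "J. R.", surname "von Perl-Hacker", suffix "III". A name that fits none of the rules raises an error instead of being quietly mangled, so the troublesome 2% can be set aside for a human to look at.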
In reply to Re: regex: seperating parts of non-formatted names
by dws
in thread regex: seperating parts of non-formatted names
by emilford