in reply to simpler regex

Not well tested, but

$_ = 'A A Jones'; s[(?<=[A-Z])(?=\s)][.]g; print;;
A. A. Jones

$_ = 'Bob J Smith'; s[(?<=[A-Z])(?=\s)][.]g; print;;
Bob J. Smith

$_ = 'Dr P J van Houten'; s[(?<=[A-Z])(?=\s)][.]g; print;;
Dr P. J. van Houten


Re^2: simpler regex
by johngg (Canon) on May 10, 2007 at 09:01 UTC
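    I think that's going to start doing the wrong thing if Dr van Houten starts putting letters after his name. The first set is OK, but as you add more you get unwanted dots, because the pattern puts a dot after any capital letter that is followed by whitespace, including the last letter of "MD".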

    $ perl -le '$_ = q{Dr P J van Houten MD};
    > s[(?<=[A-Z])(?=\s)][.]g;
    > print;'
    Dr P. J. van Houten MD
    $ perl -le '$_ = q{Dr P J van Houten MD FRCS};
    > s[(?<=[A-Z])(?=\s)][.]g;
    > print;'
    Dr P. J. van Houten MD. FRCS
    $

    A possible solution is to use alternation of two look-behinds.

    $ perl -le '$_ = q{Dr P J van Houten MD FRCS};
    > s{(?:(?<=\A[A-Z])|(?<=\s[A-Z]))(?=\s)}{.}g;
    > print;'
    Dr P. J. van Houten MD FRCS
    $

    Cheers,

    JohnGG
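
For anyone wanting to try the two-look-behind version on more names, here is a minimal sketch that wraps it in a sub and runs it over the names from this thread. The sub name dot_initials is made up for illustration and is not from either post. The likely reason for two separate look-behinds rather than a single (?<=(?:\A|\s)[A-Z]) is that Perl look-behinds traditionally had to be fixed-width, and that single form mixes widths of one and two characters.

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): put a dot after
# each single-letter initial, using the alternation of two look-behinds.
sub dot_initials {
    my ($name) = @_;

    # Match the empty position after a capital that either starts the
    # string or follows whitespace, and that is followed by whitespace.
    $name =~ s{(?:(?<=\A[A-Z])|(?<=\s[A-Z]))(?=\s)}{.}g;

    return $name;
}

# The names used earlier in the thread.
print dot_initials($_), "\n"
    for 'A A Jones', 'Bob J Smith', 'Dr P J van Houten MD FRCS';
```

Note that this still treats any stand-alone capital letter as an initial, so it assumes the input really is a personal name.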