texuser74 has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

Like lc($string) producing lowercase string...

Is there a way to produce formal case, i.e. something like

'Proper Name'

I mean only the first character of a word alone in uppercase and rest in lowercase.

Replies are listed 'Best First'.
Re: change case
by davido (Cardinal) on Nov 26, 2003 at 03:05 UTC
    You want the ucfirst function, possibly applied word by word.

    Let's say you have a string where you want to ucfirst each word. You could do that with the following:

    $string =~ s/\b(\w+)\b/ucfirst($1)/eg;

    You could modify the above regexp to handle apostrophes and hyphens by using negative lookbehind:

    $string =~ s/(?<![-'])\b(\w+)\b/ucfirst($1)/eg;

    The above substitution regex will apply ucfirst to any cluster of "word" characters as long as they're not immediately preceded by an apostrophe or a hyphen. That way you don't end up with Frank'S Place when you really want Frank's Place.

    Negative lookbehind with a character class is the right tool, as opposed to positive lookbehind for a negated character class, because you only care that an apostrophe or hyphen doesn't appear before the 'word' cluster. A negated character class inside of a positive lookbehind would require that a "non-apostrophe, non-hyphen" exist in front of the word cluster, which is going to foul everything up. You don't want that. ;)
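A minimal sketch of that difference, assuming two hypothetical helper subs (cap_negative_lookbehind and cap_positive_lookbehind are names made up for this illustration, not part of the original post). The positive-lookbehind variant needs some character in front of the word, so a word at the very start of the string is never capitalized:

```perl
use strict;
use warnings;

# Hypothetical helper: negative lookbehind, as in the substitution above.
# It only demands that no apostrophe or hyphen precedes the word.
sub cap_negative_lookbehind {
    my ($s) = @_;
    $s =~ s/(?<![-'])\b(\w+)\b/ucfirst($1)/eg;
    return $s;
}

# Hypothetical helper: positive lookbehind for a negated character class.
# It demands that SOME character other than ' or - precedes the word,
# which cannot be true at the start of the string.
sub cap_positive_lookbehind {
    my ($s) = @_;
    $s =~ s/(?<=[^-'])\b(\w+)\b/ucfirst($1)/eg;
    return $s;
}

print cap_negative_lookbehind("frank's place"), "\n";   # Frank's Place
print cap_positive_lookbehind("frank's place"), "\n";   # frank's Place
```

Note that ucfirst leaves the remaining letters alone, so neither variant lowercases the rest of each word.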


    Dave


    "If I had my life to live over again, I'd be a plumber." -- Albert Einstein

      You can skip the /e modifier by using the \u escape. And you don't need to capture the whole word; the first letter will do.

      perl -ple 's/\b(?<![-\x27])(\w)/\u$1/g'

      -sauoq
      "My two cents aren't worth a dime.";
      
Re: change case
by Roger (Parson) on Nov 26, 2003 at 03:44 UTC
    Or you could use a single regular expression -
    #!/usr/local/bin/perl -w
    use strict;

    while (<DATA>) {
        s/(\w+)/\u\L$1/g;
        print "$_";
    }

    # or just a one liner -
    # s/(\w+)/\u\L$1/g, print for <DATA>;

    __DATA__
    foo bar ned KelLy james bond
    And the output is -
    Foo Bar Ned Kelly James Bond
Re: change case
by jweed (Chaplain) on Nov 26, 2003 at 03:06 UTC

    What you want is called title case.

    Unfortunately, the function associated with it, ucfirst, only does its magic on the first letter of the string, not the first letter of each word.

    Try

    join ' ' , map ucfirst, split ' '
    to do what you want.
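As a rough sketch of how that expression might be wrapped up, assuming two hypothetical subs (title_case_split and title_case_split_lc are illustrative names only). Because split ' ' splits on runs of whitespace and drops leading whitespace, the rejoined string has its spacing normalized to single spaces, which may or may not be what you want:

```perl
use strict;
use warnings;

# Hypothetical helper: ucfirst each whitespace-separated word.
# The rest of each word is left exactly as it was.
sub title_case_split {
    my ($s) = @_;
    return join ' ', map { ucfirst($_) } split ' ', $s;
}

# Hypothetical variant: lowercase each word first, so that only the
# first character ends up in uppercase, as the original question asks.
sub title_case_split_lc {
    my ($s) = @_;
    return join ' ', map { ucfirst(lc($_)) } split ' ', $s;
}

print title_case_split("  foo   bar\tbaz "), "\n";   # Foo Bar Baz
print title_case_split("ned KelLy"), "\n";           # Ned KelLy
print title_case_split_lc("ned KelLy"), "\n";        # Ned Kelly
```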

    Like with upper case, the character escape for title case is \u. It has the same problem as ucfirst above.

    Edit:

    Never EVER use tr/a-z/A-Z/ for any uppercasing stuff. It doesn't play well with unicode. Not that you were going to, but it's always good to check.
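A small illustration of why, assuming a made-up sample string containing U+017E (LATIN SMALL LETTER Z WITH CARON); upper_tr and code_points are hypothetical helpers for this sketch. tr/a-z/A-Z/ only touches the 26 ASCII letters, while uc knows the Unicode case mappings:

```perl
use strict;
use warnings;

# Hypothetical helper: uppercase with tr///, which only maps a-z.
sub upper_tr {
    my ($s) = @_;
    $s =~ tr/a-z/A-Z/;
    return $s;
}

# Hypothetical helper: show a string as code points, so the result
# does not depend on the terminal's encoding.
sub code_points {
    my ($s) = @_;
    return join ' ', map { sprintf 'U+%04X', ord($_) } split //, $s;
}

# Sample data (an assumption for this example): z-caron, i, z-caron, e, k
my $name = "\x{17e}i\x{17e}ek";

print code_points(upper_tr($name)), "\n";   # U+017E U+0049 U+017E U+0045 U+004B
print code_points(uc($name)), "\n";         # U+017D U+0049 U+017D U+0045 U+004B
```

The tr/// version leaves both U+017E characters in lowercase; uc maps them to U+017D.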

    And it looks like davido beat me. Oh well. I must note, though, that I think my solution is faster if the string is especially long, especially if the words are delimited by a single space. Then no regex work needs to be done. ;)


    Who is Kayser Söze?
      I like your approach too though.

      So we have a s/// (substitution) method, and a join / split / map method. As you mentioned, tr/// isn't a good candidate, but I couldn't resist at least tossing out there the thought of tr/// applied to the first letter of words by using pos along with m/\b[a-z]/g (to locate position of first letters of words) along with substr (as an lvalue to constrain the effects of tr///). ...it could work, but sounds like madness. ;)
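For the curious, here is one way that madness might look. This is only a sketch of the idea described above, not code from the thread: cap_first_tr is a hypothetical name, and the positions are collected in a first pass so that the string is not modified while m//g is still walking it. It inherits the tr/// limitation (ASCII letters only) and makes no attempt to handle apostrophes:

```perl
use strict;
use warnings;

# Hypothetical helper: uppercase the first letter of each word using
# m//g + pos to find the letters and tr/// on an lvalue substr to change them.
sub cap_first_tr {
    my ($s) = @_;

    # Pass 1: record the offset of every lowercase ASCII letter that
    # starts a word. After a one-character match, pos() is offset + 1.
    my @starts;
    while ($s =~ /\b[a-z]/g) {
        push @starts, pos($s) - 1;
    }

    # Pass 2: substr as an lvalue confines tr/// to that single character.
    substr($s, $_, 1) =~ tr/a-z/A-Z/ for @starts;

    return $s;
}

print cap_first_tr("foo bar ned kelly"), "\n";   # Foo Bar Ned Kelly
```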

      TI(always)MTOWTDI


      Dave


      "If I had my life to live over again, I'd be a plumber." -- Albert Einstein
Re: change case
by Abigail-II (Bishop) on Nov 26, 2003 at 08:05 UTC