Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Given a string, is there an easy way to capitalise the first letter, then the first "word" letter after each "." or "!"?

Thanks

Replies are listed 'Best First'.
RE: How do i capitalise sentences?
by turnstep (Parson) on Jul 20, 2000 at 14:50 UTC
    $punc = "?!.";
    $string = ucfirst($string);
    $string =~ s#([$punc][^a-z]*[a-z])#uc($1)#eg;
    ## or
    $string =~ s#([$punc]\s*[a-z])#uc($1)#eg;

    The first meets the strict definition of the requirements, but also causes
    "Who are you? 6 is my number"
    to become
    "Who are you? 6 Is my number"
    which may or may not be desired. If not, the second one only allows whitespace to be skipped between the end of a sentence and the start of a word.

    I used a-z instead of \w because otherwise perl will try to "uppercase" an underscore, which is not what we want. It's also safe to put the whole thing in a single paren, and to uppercase the whole thing, since the only part able to be uppercased is the part we *want* uppercased. I also moved the specification of what constitutes the end of a sentence out of the regular expression, which makes it a little easier to read, and also allows easier changes in the future without accidentally breaking your regular expression.

    The usual warning vis a vis use English; and \w apply, of course. :)
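
To make the difference between the two substitutions concrete, here is a minimal sketch that wraps each one in a sub and runs both on the example sentence. The sub names `cap_loose` and `cap_strict` are made up for illustration and are not part of the original reply.

```perl
use strict;
use warnings;

my $punc = '?!.';    # characters treated as sentence terminators

# Variant 1 (hypothetical name): after the punctuation, skip any run of
# characters that are not lowercase letters, then uppercase the next letter.
sub cap_loose {
    my ($s) = @_;
    $s = ucfirst $s;                              # ucfirst returns a copy
    $s =~ s#([$punc][^a-z]*[a-z])#uc($1)#eg;
    return $s;
}

# Variant 2 (hypothetical name): only whitespace may sit between the
# punctuation and the letter.
sub cap_strict {
    my ($s) = @_;
    $s = ucfirst $s;
    $s =~ s#([$punc]\s*[a-z])#uc($1)#eg;
    return $s;
}

my $in = 'who are you? 6 is my number';
print cap_loose($in),  "\n";    # Who are you? 6 Is my number
print cap_strict($in), "\n";    # Who are you? 6 is my number
```

Note that `ucfirst` does not change its argument in place, so its return value has to be assigned back.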

Re: How do i capitalise sentences?
by lhoward (Vicar) on Jul 20, 2000 at 07:51 UTC
    my $d = ucfirst("sample. data some! will be. capitalized");
    $d =~ s/([\!\.]\s*)(\w)/$1\U$2\E/g;
    ucfirst capitalizes the first character of the string; the regular expression takes care of the rest. My code above would not behave properly if there were a lower-case abbreviation, etc...
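
The abbreviation caveat is easy to reproduce. The following is only a sketch: the sub name `capitalise_naive` and the sample sentence are invented for illustration, and `\u` is used here to uppercase just the one captured character.

```perl
use strict;
use warnings;

# Hypothetical helper: capitalise the first character, then the first
# word character after every "!" or ".", with optional whitespace between.
sub capitalise_naive {
    my ($s) = @_;
    $s = ucfirst $s;
    $s =~ s/([!.]\s*)(\w)/$1\u$2/g;
    return $s;
}

print capitalise_naive('i use perl, e.g. for text. it works'), "\n";
# prints: I use perl, e.G. For text. It works
```

Both periods in "e.g." are treated as sentence ends, so the "g" and the following "for" are capitalised as well.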
Re: How do i capitalise sentences?
by visnu (Sexton) on Jul 20, 2000 at 07:55 UTC
    while (<>) { $_ = ucfirst; s/([!.?])(\s+)(\w)/$1$2\U$3/g; print; }
    revision: lhoward's right in that it should be \s* instead of \s+.
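
The choice between \s+ and \s* is a trade-off, which this small sketch tries to show. The sub names `cap_plus` and `cap_star` and the sample input are assumptions made for the example, not something from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: requires at least one whitespace character
# after the punctuation (\s+).
sub cap_plus {
    my ($s) = @_;
    $s = ucfirst $s;
    $s =~ s/([!.?])(\s+)(\w)/$1$2\U$3/g;
    return $s;
}

# Hypothetical helper: whitespace after the punctuation is optional (\s*).
sub cap_star {
    my ($s) = @_;
    $s = ucfirst $s;
    $s =~ s/([!.?])(\s*)(\w)/$1$2\U$3/g;
    return $s;
}

my $in = 'see file.txt. then stop';
print cap_plus($in), "\n";    # See file.txt. Then stop
print cap_star($in), "\n";    # See file.Txt. Then stop
```

With \s* a sentence that follows its period without a space is still capitalised, but so is anything after a period inside a word, such as a file extension.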
Re: How do i capitalise sentences?
by japhy (Canon) on Jul 20, 2000 at 17:07 UTC
    It's not totally simple, considering there are cases where periods appear in the middle of a sentence. Unless you start parsing the entire English language, though, you'll have to deal with occasional foul-ups.
    $string =~ s/((?:^\s*|[!?.]+\s+)[^a-zA-Z0-9]*)([a-z])/$1\u$2/g;
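
As a rough illustration of what this substitution does and does not catch, here is a sketch that wraps it in a sub (the name `cap_japhy` and the sample input are made up for this example).

```perl
use strict;
use warnings;

# Hypothetical wrapper around the substitution quoted above.
sub cap_japhy {
    my ($s) = @_;
    # $1: start of string (plus leading whitespace), or punctuation followed
    #     by whitespace, then any run of non-alphanumeric characters.
    # $2: the lowercase letter to be uppercased with \u.
    $s =~ s/((?:^\s*|[!?.]+\s+)[^a-zA-Z0-9]*)([a-z])/$1\u$2/g;
    return $s;
}

my $in = q{  hello there! "how are you?" i am fine. 6 is my number};
print cap_japhy($in), "\n";
```

On this input the leading "hello" and the quoted "how" are capitalised, and "6 is" is left alone because a digit follows the period. The "i" after the closing quote stays lowercase, since the "?" is followed by a quote mark instead of whitespace; that is one of the occasional foul-ups mentioned above.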