in reply to Match non-capitalized words at the beginning of each sentence

(updated node) Actually, ignore me and just read what Zaxo and %mick have to say.

Hmmmm, this will get all non-capitalized words at the beginning of each sentence, except for the very first sentence, and store them in an array:

my $s = 'hello world! how are you? whoops, forgot.';
# [a-z]\w* : one lowercase letter, then any further word characters
my @no_caps = $s =~ /[.?!]\s*([a-z]\w*)/g;
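To see what that list-context match actually hands back, here is a small sketch. The sample string is the one above; the pattern uses `[a-z]\w*` so that one-letter words such as "a" would also count, and the print loop is only there for illustration.

```perl
use strict;
use warnings;

my $s = 'hello world! how are you? whoops, forgot.';

# In list context, a /g match returns every captured group from every match.
my @no_caps = $s =~ /[.?!]\s*([a-z]\w*)/g;

print "$_\n" for @no_caps;    # prints "how", then "whoops"
```

Note that "hello" is not in the list: nothing precedes it for `[.?!]` to match, which is why the first sentence needs separate handling.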
Better to replace them on the spot, says me:
$s =~ s/^([a-z]\w*)/ucfirst $1/e;
$s =~ s/([.?!]\s*)([a-z]\w*)/$1 . ucfirst $2/eg;
The first regex gets the first word of the string, the second takes care of the rest. Putting this back into your original code we get:
use strict;
@ARGV = '/Perl/LearningPerl/Test';
while (<>) {
    # first word of the current line
    if (/^([a-z]\w*)/) {
        print "$1 is not capitalized\n";
    }
    # any word that follows sentence-ending punctuation
    while (/([.?!]\s*)([a-z]\w*)/g) {
        print "$2 is not capitalized\n";
    }
}
And that's ugly. The if catches the first word (of each line, since we read a line at a time), and the while loop takes care of the rest.

And it is still broken, as newlines are the monkey wrench in this machine: a line break in the middle of a sentence, or between two sentences, throws off both checks. Taking Zaxo's suggestion of slurping the entire file into a scalar will fix that (the error, not the ugliness):

# with $/ undefined, one read returns the whole file
my $file = do { local $/; <> };
if ($file =~ /^([a-z]\w*)/) {
    print "$1 is not capitalized\n";
}
while ($file =~ /([.?!]\s*)([a-z]\w*)/g) {
    print "$2 is not capitalized\n";
}
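Here is a self-contained sketch of the slurp approach that you can run without a file on disk. The sample text and the `@found` array are made up for illustration; the text deliberately wraps one sentence across a line break and starts another sentence on a new line.

```perl
use strict;
use warnings;

# Hypothetical sample text: "wraps" begins a line in mid-sentence,
# while "and" begins both a line and a sentence.
my $text = "hello there, this sentence\n"
         . "wraps onto a second line.\n"
         . "and this one starts lowercase. Fine here.\n";

# Open the string as an in-memory filehandle.
open my $fh, '<', \$text or die "cannot open in-memory file: $!";

# Slurp: with $/ undefined, a single read returns everything.
my $file = do { local $/; <$fh> };
close $fh;

my @found;
# Without /m, ^ matches only at the very start of the string.
push @found, $1 if $file =~ /^([a-z]\w*)/;
# \s* also matches newlines, so sentences may begin on a new line.
push @found, $1 while $file =~ /[.?!]\s*([a-z]\w*)/g;

print "$_ is not capitalized\n" for @found;
# prints:
#   hello is not capitalized
#   and is not capitalized
```

"wraps" is left alone because it is not at the start of the string and does not follow sentence punctuation, whereas a line-at-a-time loop with `^` would have flagged it.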
Sorry for being too quick to respond.

jeffa

I shoulda waited for merlyn ...