in reply to Get Vowels from sentence
    $_ = <<SENTENCE;
    There once was a doggy in the window.
    I went into the store and inquired about the price,
    the doggy was too nice to ever think twice,
    so now I have a doggy to care for.
    SENTENCE

The next statement will extract all the vowels and save each one as a separate element in the @vowels array ("y" is not counted as a vowel in this case):

    my @vowels = ( /[aeiuo]/gi );

I get 53 vowels, total, counting both upper- and lower-case (because of the "i" modifier after the regex). If you wanted to count "y" when it functions as a vowel (i.e. when it is not followed by another vowel), the regex would be like this:

    my @vowels = ( /[aeiou]|y(?![aeiou])/gi );

I get 56 vowels that way. The "(?!...)" part is called a "zero-width negative-look-ahead assertion" (sounds scary, eh?), and you can look it up in the perlre man page.
By putting the "g" modifier on the regex ("match globally" -- i.e. find all occurrences of the pattern), and putting the whole thing in a list context (assigning to an array), all the matches are captured as array elements.
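As a rough illustration of the list-context point, here is a minimal sketch. The sample text and the variable names ($text, $count) are made up for this example and are not from the original post; it also shows the scalar-context behaviour of "g" for contrast.

```perl
use strict;
use warnings;

# Hypothetical sample text, not the sentence used in the post.
my $text = "A quick brown fox";

# List context with /g: every match becomes one list element.
my @vowels = ( $text =~ /[aeiou]/gi );
print "matches: @vowels\n";                    # matches: A u i o o
print "count:   ", scalar(@vowels), "\n";      # count:   5

# To count without keeping the matches, assign to an empty list and
# evaluate that assignment in scalar context.
my $count = () = $text =~ /[aeiou]/gi;
print "count:   $count\n";                     # count:   5

# Scalar context with /g is different: each evaluation finds only the
# next match, so it is normally driven by a while loop.
while ( $text =~ /([aeiou])/gi ) {
    print "found '$1', next search starts at offset ", pos($text), "\n";
}
```

The array form is the handiest when you want the vowels themselves; the empty-list assignment is enough if you only want the total.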
update: Some "grammar police" might insist that "y" can only function as a vowel when there is no other vowel either before or after it -- e.g. "y" is not a vowel in "clay" (just as "w" is not a vowel in "claw"). Still, it seems like it has to be a vowel in cases like "rhythm". Those who hold this position would make the regex like this:

    my @vowels = ( /[aeiou]|(?<![aeiou])y(?![aeiou])/gi );

using "(?<!...)" -- the "zero-width negative-look-behind assertion". The "zero-width" feature means that the assertions are not counted as part of the matched string -- the regex still only matches (and captures) a single character at a time, which is either one of "a e i o u" or else a "y" that satisfies the zero-width assertions.
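To see how the three patterns differ, here is a small comparison sketch. The word list, the labels and the %result hash are made up for this example; only the regexes come from the post.

```perl
use strict;
use warnings;

# The three patterns discussed above, with hypothetical labels.
my @patterns = (
    [ 'vowels only'           => qr/[aeiou]/i ],
    [ 'y + look-ahead'        => qr/[aeiou]|y(?![aeiou])/i ],
    [ 'y + look-behind/ahead' => qr/[aeiou]|(?<![aeiou])y(?![aeiou])/i ],
);

my %result;
for my $word (qw(clay rhythm yes doggy)) {
    for my $p (@patterns) {
        my ( $label, $re ) = @$p;
        # List context plus /g collects every single-character match.
        my @found = ( $word =~ /$re/g );
        $result{$word}{$label} = "@found";
        printf "%-7s %-22s %s\n", $word, $label, "@found";
    }
}
```

With these words, "clay" yields "a y" under the look-ahead-only pattern but just "a" once the look-behind is added, while "rhythm" yields "y" under both. Every element is still a single character, since the assertions consume nothing.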