in reply to Re: Vowel search
in thread Vowel search

Of course, this doesn't handle the conditional nature of y as a vowel.

In how many ordinary words is y one of a pair of two consecutive vowels?

I assume you have plans to write a machine learning script to train against a dictionary so it can develop heuristics for resolving the ambiguity.

I assume the novice Perl programmer with the PerlMonks username Noob@Perl (noob == newbie == neophyte) has no such plans.

Re^3: Vowel search
by roboticus (Chancellor) on Jun 12, 2014 at 15:27 UTC

    Jim:

    Hey, guy, today a bit of playing with my grey matter suggests they may be fairly common.... ;^D

    roboticus@sparky:~$ grep -i -E 'y[aeiou]|[aeiou]y' /usr/share/dict/american-english | wc -l
    3244
    roboticus@sparky:~$ wc -l /usr/share/dict/american-english
    99171 /usr/share/dict/american-english

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

      Very few of those 3,244 words have a y in them that is one of a pair of vowels. You've included all the consonant y's and silent y's in your count.
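To make that objection concrete, here is a minimal sketch in Python that applies the same pattern as the grep command to a handful of hand-picked words. The word list and the helper name `y_next_to_vowel` are illustrative assumptions, not anything from the thread. The pattern only tests adjacency, so it cannot distinguish the consonant y in "yes" or "beyond" from the y in "day".

```python
import re

# Same pattern as the grep command: a y directly before or after a, e, i, o or u.
ADJACENT_Y = re.compile(r"y[aeiou]|[aeiou]y", re.IGNORECASE)

def y_next_to_vowel(word):
    """Return True if the word contains a y adjacent to one of a, e, i, o, u."""
    return ADJACENT_Y.search(word) is not None

# Hypothetical sample words chosen by hand for illustration.
samples = ["yes", "beyond", "day", "rhythm", "sky"]
for word in samples:
    print(word, y_next_to_vowel(word))
```

Words such as "rhythm" and "sky", where y is the only vowel, are not matched at all, so the pattern both over-counts and under-counts.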

        Jim:

        Yes, that's true, but I'm a bit too lazy to write the regular expression that separates the vowel, consonant and silent versions. However, I think it's a reasonable compromise, given that my quick and dirty scan missed:

        • The various words with a vowel-form of w,
        • The words in which the y is missing entirely, and
        • The words where the dictionary uses the wrong letter in place of the y.

        Pending evidence to the contrary, I expect that the false inclusions are roughly balanced by the missing entries.

        </joke>

        ...roboticus


Re^3: Vowel search
by kennethk (Abbot) on Jun 12, 2014 at 15:19 UTC

    How many words in English are ordinary? And how does one define a vowel? In the word thigh, the vowels are i, g, and h. That's certainly more than two consecutive vowel characters, though it's only one vowel sound.

    You'll have to excuse a poor attempt at humor, attempting to illustrate how poorly constrained the spec is in actuality, and trying to highlight a distinct lack of effort on what strongly resembles a homework assignment.


    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      I realize you were tacitly chastising the OP for what you thought was a poor post. I don't agree that it was all that bad. I think the OP is earnest and has demonstrated a genuine interest in learning Perl. He or she just seems daunted by the basics. And after all, the OP did let us know that he or she is a Perl novice by his or her choice of PerlMonks username.

      I was tacitly chastising you right back for the glaring omission in your self-described "poor attempt at humor." You picked on the infrequent case of y's that are one of a pair of consecutive vowels, but you completely missed the case of all vowels with diacritical marks. What about them? What about the possibility of input text in different character encodings, both "legacy" and Unicode? What about Unicode combining characters and Unicode normalization forms? There's much more to say about the definition of "vowel" as any code point that matches the trivial regular expression pattern [AaEeIiOoUu] than what you wrote tauntingly about it in your reply to the OP.
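The point about diacritics and normalization forms can be illustrated with a short sketch. It is written in Python rather than Perl, and the function names are hypothetical. It assumes the input has already been decoded to Unicode text, and it treats "vowel" as "base letter in [AaEeIiOoUu] after canonical decomposition", which is only one possible definition.

```python
import re
import unicodedata

# The trivial pattern under discussion.
ASCII_VOWELS = re.compile(r"[AaEeIiOoUu]")

def count_vowels_naive(text):
    """Count code points matching [AaEeIiOoUu], with no normalization."""
    return len(ASCII_VOWELS.findall(text))

def count_vowels_normalized(text):
    """Decompose to NFD, drop combining marks, then count base-letter vowels."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return len(ASCII_VOWELS.findall(stripped))

precomposed = "caf\u00e9"   # e with acute accent as a single code point
combining = "cafe\u0301"    # plain e followed by a combining acute accent

for label, text in (("precomposed", precomposed), ("combining", combining)):
    print(label, count_vowels_naive(text), count_vowels_normalized(text))
```

The naive count differs between the two spellings of the same word, while the normalized count does not. Letters that have no canonical decomposition, such as ø or æ, are still missed by this approach, so it narrows the gap without closing it.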