Polyglot has asked for the wisdom of the Perl Monks concerning the following question:
\p{InThai} | \p{InLao}
However, these languages have, for example, three classes of consonants: high, middle, and low. These classes, in conjunction with tone marks, determine the tone of each syllable, and play an important role in determining the boundaries for words and syllables (these languages do not space-delimit words).
The Unicode documentation I found in some of the packages on CPAN mentioned that the programmers did not know Thai, and could not, therefore, do much of use with it.
I'm trying to create, then, a set of classes for these characters, but I have never created a Perl package before, only used them. The Perl book I have was vague about how to define the subroutines for this, and has left me confused.
I'm not asking for anyone to create the package. I only ask for someone to point me in the right direction. If we were to assume that x = high class, y = low class, and z = middle class, could anyone give me an example of how I might make a package that could be used something like this:
use Thai;
$line =~ m/\p{ThaiHighClass}\p{ThaiLowClass}/;
Naturally, I would like to define much more than consonant classes for these languages, but if I could do just this much, the rest would be easy to add.
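For concreteness, the usage above can be sketched with Perl's user-defined character properties, as described in perlunicode. This is a rough starting point under stated assumptions, not a tested module. Perl requires such property names to begin with "Is" or "In", so the names here are the hypothetical Thai::IsHighClass, Thai::IsMiddleClass, and Thai::IsLowClass rather than ThaiHighClass. The package is defined inline instead of in a separate Thai.pm, the helper _as_property is invented for illustration, and only the Thai consonants are covered (Lao would need its own lists).

```perl
use strict;
use warnings;

package Thai;

# Hypothetical helper: turn a list of code points into the
# newline-separated list of hex values that a user-defined
# property subroutine has to return.
sub _as_property {
    return join '', map { sprintf "%04X\n", $_ } @_;
}

# High-class consonants (11 letters).
sub IsHighClass {
    return _as_property(
        0x0E02, 0x0E03, 0x0E09, 0x0E10, 0x0E16,
        0x0E1C, 0x0E1D, 0x0E28 .. 0x0E2B,
    );
}

# Middle-class consonants (9 letters).
sub IsMiddleClass {
    return _as_property(
        0x0E01, 0x0E08, 0x0E0E, 0x0E0F, 0x0E14,
        0x0E15, 0x0E1A, 0x0E1B, 0x0E2D,
    );
}

# Low-class consonants (the remaining 24 letters).
sub IsLowClass {
    return _as_property(
        0x0E04 .. 0x0E07, 0x0E0A .. 0x0E0D, 0x0E11 .. 0x0E13,
        0x0E17 .. 0x0E19, 0x0E1E .. 0x0E23,
        0x0E25, 0x0E27, 0x0E2C, 0x0E2E,
    );
}

package main;

# KHO KHAI (high class) followed by KHO KHWAI (low class).
my $line = "\x{0E02}\x{0E04}";

# Outside package Thai, the property name must be package-qualified.
my $matched = $line =~ m/\p{Thai::IsHighClass}\p{Thai::IsLowClass}/ ? 1 : 0;
print $matched ? "matched\n" : "no match\n";
```

If the subroutines lived in a real Thai.pm and were exported into the calling package, the unqualified form \p{IsHighClass} should work after use Thai; the qualified form is used here only to keep the sketch in one file.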
Help will be much appreciated, and success may mean an addition to CPAN.
Blessings,
Polyglot