in reply to Re^5: Namespace/advice for new CPAN modules for Thai & Lao ( Regexp::CharProps - User Defined Character Properties )
in thread Namespace/advice for new CPAN modules for Thai & Lao
Yes. It definitely wouldn't work on an upper-ASCII-type (8-bit) encoding such as the one Thai originally used, at least not without decoding the data first. I guess I put "UTF8" because that is what gets used most with Thai, and it is what I knew would work, having developed strictly with it. I presume any Unicode encoding should work equally well once the text is decoded, though I don't claim to be an expert on Unicode.
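To make the decoding point concrete, here is a minimal sketch. It assumes the legacy 8-bit data is ISO-8859-11 (the ISO counterpart of TIS-620) and uses the core Encode module; the helper name has_thai is only for illustration and is not part of any module discussed here.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: true if the string contains a character
# from the Unicode Thai block (U+0E00..U+0E7F).
sub has_thai {
    my ($str) = @_;
    return $str =~ /\p{InThai}/ ? 1 : 0;
}

# Byte 0xA1 is KO KAI in TIS-620 / ISO-8859-11.
my $raw     = "\xA1";
my $decoded = decode('iso-8859-11', "\xA1");   # becomes U+0E01

printf "raw bytes:     %s\n", has_thai($raw)     ? 'Thai' : 'not Thai';
printf "after decode:  %s\n", has_thai($decoded) ? 'Thai' : 'not Thai';
```

Undecoded, the byte is treated as code point U+00A1, which lies outside the Thai block, so the property only has a chance of matching after the decode step.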
In your code example:

    print "\$_ has got Thai" if m{
        \p{InThai}
        |\p{InThaiCons}
        |\p{InThaiHCons}
        |\p{InThaiMCons}
        |\p{InThaiLCons}
        |\p{InThaiVowel}
        |\p{InThaiPreVowel}
        |\p{InThaiPostVowel}
        |\p{InThaiCompVowel}
        |\p{InThaiDigit}
        |\p{InThaiTone}
        |\p{InThaiPunct}
    }x;

...only the first item in the OR'ed list should ever see action. All of the subsequent categories are already "InThai", and the "InThai" token already comes standard with Perl, AFAIK (see pg. 172 of "Programming Perl, Third Edition"), so that code would do little to test additional functionality. If the first line (\p{InThai}) failed, none of the others should succeed either.
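A small sketch of that shadowing effect, in case it helps. The InThaiDigit sub below is a hypothetical stand-in for one of the proposed properties (I am assuming it covers the Thai digits U+0E50..U+0E59), and which_branch is just an illustrative helper that uses named captures to report which alternative matched.

```perl
use strict;
use warnings;
use 5.010;   # named captures and the // operator

# Hypothetical user-defined property, for illustration only:
# Thai digits, as "start<TAB>end" hex ranges, one range per line.
sub InThaiDigit { return "0E50\t0E59\n" }

# Return the name of the alternative that matched, or undef if none did.
sub which_branch {
    my ($str) = @_;
    return undef
        unless $str =~ m{ (?<block>\p{InThai}) | (?<digit>\p{InThaiDigit}) }x;
    my ($name) = keys %+;   # %+ holds only the named groups that took part
    return $name;
}

my $thai_digit = "\x{0E53}";   # THAI DIGIT THREE

printf "U+0E53 in the alternation: %s\n", which_branch($thai_digit) // 'nothing';
printf "abc in the alternation:    %s\n", which_branch("abc")       // 'nothing';

# Testing the property on its own is what exercises the user-defined sub.
printf "InThaiDigit by itself:     %s\n",
    $thai_digit =~ /\p{InThaiDigit}/ ? 'match' : 'no match';
```

For the Thai digit, the alternation should report "block": the built-in \p{InThai} wins before the narrower property is ever tried. To test the new properties, each one would need its own pattern, or the broad \p{InThai} alternative would have to come last.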
NOTE: I've updated my list to reflect your proposed name, but I've adapted it slightly to one that seems a better fit to me.
Blessings,
~Polyglot~