PerlMonks
Re^3: The "real length" of UTF8 strings by moritz (Cardinal)
on Sep 24, 2008 at 07:57 UTC ( [id://713369]=note )
Sure, but the Han script probably contains about 40000 characters, so there is no way to write the list by hand. That's why my example tests each character against the Unicode property \p{Han}, i.e. whether the character belongs to that script. For a better description of Unicode properties, scripts, and blocks in regexes I recommend "Mastering Regular Expressions" by Jeffrey Friedl, pages 121ff.
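For readers without the earlier post at hand, here is a minimal sketch of the idea, not the original example from this thread. It assumes the string has already been decoded into Perl characters; the helper name `count_han` and the sample string are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count the characters of an already-decoded
# string that belong to the Han script.
sub count_han {
    my ($text) = @_;
    # A global match in list context returns every match;
    # assigning through the empty list yields their count.
    my $count = () = $text =~ /\p{Han}/g;
    return $count;
}

# Sample string "Perl 漢字 test", written with escapes so the
# source file encoding does not matter.
my $sample = "Perl \x{6F22}\x{5B57} test";

printf "%d of %d characters are Han\n", count_han($sample), length $sample;
```

Letting the regex engine answer "is this character Han?" through the property avoids maintaining any character list in the code.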