I'm validating some mixed English and Japanese UTF-8 input. It sometimes contains a-z, A-Z and 0-9 entered not only from the common ASCII-compatible Unicode range, but also from the fullwidth range U+FF10 - U+FF5E
http://en.wikibooks.org/wiki/Unicode/Character_reference/F000-FFFF
for example
A (Unicode U+0041)
Ａ (Unicode U+FF21 http://www.decodeunicode.org/u+FF21)
I understand that, to be safe, I need to interpret the Unicode characters I accept only as their smallest Unicode representation
e.g. interpret U+FF21 as U+0041 (as above)
So the question is: can I use some function/module of Perl to do this, or do I have to convert them manually with a mapping? From all the experimenting I've done so far, it seems like I'll have to do it manually. This surprises me if I am supposed to interpret them in their smaller representation.
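One candidate worth checking is the core Unicode::Normalize module: its NFKC compatibility normalization folds the fullwidth forms to their ASCII counterparts. Below is a minimal sketch, assuming the input has already been decoded from UTF-8 into Perl characters. The helper names `to_halfwidth` and `fold_fullwidth_only` are hypothetical, made up for illustration.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFKC);

# Hypothetical helper: apply NFKC, which replaces compatibility
# characters (fullwidth forms included) with their plain equivalents.
sub to_halfwidth {
    my ($text) = @_;
    return NFKC($text);
}

# Hypothetical alternative: touch only U+FF10 - U+FF5E by shifting
# each character down by 0xFEE0 into the range U+0030 - U+007E.
sub fold_fullwidth_only {
    my ($text) = @_;
    (my $copy = $text) =~ tr/\x{FF10}-\x{FF5E}/\x{30}-\x{7E}/;
    return $copy;
}

# Fullwidth "Abc123", written with escapes so the source stays ASCII.
my $input = "\x{FF21}\x{FF42}\x{FF43}\x{FF11}\x{FF12}\x{FF13}";

print "NFKC result is plain ASCII\n"
    if to_halfwidth($input) =~ /\A[A-Za-z0-9]+\z/;
print "tr result is plain ASCII\n"
    if fold_fullwidth_only($input) =~ /\A[A-Za-z0-9]+\z/;
```

Note that NFKC also rewrites other compatibility characters (halfwidth katakana become regular katakana, for instance), which may or may not be wanted for Japanese input. The `tr` variant leaves everything outside U+FF10 - U+FF5E untouched.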
Cheers for any feedback, sorry for my English
damian