I'm updating one of my scripts so that it supports language packs: all of the output text is being swapped for variables stored in a separate file that gets required in. That way, a translation of that one language file is all that's needed to produce output in a new language.
I'm happy with that, but I haven't worked with Unicode before, and I'm worried about languages with non-Latin scripts, such as Chinese and Japanese. Will it be as simple as having
at the top of the language file? Will this mess up all my regexps or sprintfs? What about handling these languages as input from HTML forms?
Any advice welcome, thanks.