That is a sentiment I have a lot of sympathy for, but then, would it not be better to stick to ASCII (7 bits)?
No, my boss requires all source code to be strictly in Japanese.
"No, my boss requires all source code to be strictly in Japanese."
Does your boss also require that all source code files be strictly in a single character encoding scheme of the Unicode coded character set? If she doesn't, she should. In the case of computer programs, the character encoding of the source code file is as important to the computer as the natural language is to the programmer (and to the programmer's boss).
Think of your problem as two-fold. Firstly, you have a text file character encoding conformance problem. What do you do in your programming environment to ensure that all source code files for all projects are in the same coded character set (e.g., Unicode) and character encoding scheme (e.g., UTF-8)? What discipline do you impose on your programming team to ensure that, for example, no programmer inadvertently creates a source code file in the Shift-JIS character encoding? You should rigorously enforce that all source code files are in the UTF-8 CES of the Unicode CCS, and that they always begin with the Unicode byte order mark.
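One way to enforce that discipline is an automated check, run for example before each commit. What follows is only a minimal sketch, not a finished tool: the helper name encoding_problems is hypothetical, and it assumes the core Encode module is available. It flags any file that lacks a leading UTF-8 byte order mark or that does not decode as strict UTF-8, which is what typically happens when a file containing Japanese text has been saved as Shift-JIS.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns a list of problems found in the raw bytes
# of one source file. An empty list means the file conforms.
sub encoding_problems {
    my ($bytes) = @_;
    my @problems;

    # The UTF-8 encoded byte order mark is the three bytes EF BB BF.
    push @problems, 'missing UTF-8 byte order mark'
        unless $bytes =~ /\A\xEF\xBB\xBF/;

    # Strict decoding dies on malformed UTF-8 (typical of Shift-JIS bytes).
    my $ok = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 };
    push @problems, 'not valid UTF-8' unless $ok;

    return @problems;
}

# Check every file named on the command line.
for my $path (@ARGV) {
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    my $bytes = do { local $/; <$fh> };
    close $fh;
    print "$path: $_\n" for encoding_problems($bytes);
}
```

Saved under a name of your choosing and run with a list of file names, it prints one line per problem and stays silent for conforming files. Note that a Shift-JIS file containing only ASCII characters is also valid UTF-8, so for such a file only the missing byte order mark would be reported.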
Secondly, you have a problem with including multiple Perl source code files via do(). But once you've solved the first, more fundamental character encoding conformance problem in the way I've suggested, you've also solved this second, more incidental problem.
(N.B. The footprint of the Unicode byte order mark is quite small: in UTF-8 it is just the three bytes EF BB BF at the start of the file.)
Jim
>perl -MO=Concise,-exec -e"use utf8; $x='abc';"
1 <0> enter
2 <;> nextstate(main 7 -e:1) v:U,{
3 <$> const[PV "abc"] s
4 <#> gvsv[*x] s
5 <2> sassign vKS/2
6 <@> leave[1 ref] vKP/REFC
-e syntax OK
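use utf8 has a compile-time effect: it changes how the source is parsed and adds no ops of its own to the compiled program shown above.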
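To see what that compile-time effect amounts to, here is a small sketch. It assumes the script file itself is saved as UTF-8; the variable names are only illustrative. The same three-character literal is compiled once without the pragma and once with it, and only the way the parser reads the literal differs.

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8, so the literal below occupies
# nine bytes on disk (three bytes per character).

# Without the pragma, the parser treats the literal as a string of bytes.
my $byte_count = do { no utf8; length "日本語" };

# With the pragma, the parser decodes the literal into three characters.
my $char_count = do { use utf8; length "日本語" };

print "without utf8: $byte_count\n";
print "with utf8:    $char_count\n";
```

Under that assumption the first length is 9 and the second is 3. Because the pragma is lexically scoped, it applies only inside the block (or file) in which it appears.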