From an online source:
UTF-7 isn't a "Unicode Transformation Format", as the definition can only encode code points in the BMP. However if a UTF-7 translator is to/from UTF-16 then it can encode each surrogate half as though it was a 16-bit code point, and thus can encode all code points. It is unclear if other UTF-7 software support this. UTF-7 has never been an official standard of the Unicode Consortium. It is known to have security issues, which is why software has been changed to disable its use. It is prohibited in HTML 5.
However, UTF-7 was never useful (to my knowledge) for Asian languages. It is not truly "Unicode," and even calling it "Unicode compatible" would be an exaggeration. As defined, it does not, and I think cannot, support the full range of characters in the Unicode tables of today (likely why it has become obsolete).
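The surrogate-half route mentioned in the quote can be tried out from Perl itself. What follows is only a small sketch: it assumes the UTF-7 codec bundled with the Encode module is available and that it goes through UTF-16 internally, and the sample string is made up for illustration. It says nothing about how other UTF-7 software behaves.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# U+1F600 lies outside the BMP, so UTF-16 represents it as a surrogate pair.
# The sample text is purely illustrative.
my $text = "smile: \x{1F600}";

my $utf7 = encode('UTF-7', $text);   # octets, expected to be 7-bit ASCII only
my $back = decode('UTF-7', $utf7);   # back to Perl characters

print "UTF-7 bytes: $utf7\n";
print "round trip ", ($back eq $text ? "ok" : "FAILED"), "\n";
```

If the round trip succeeds, the non-BMP character survived only because each surrogate half was carried as if it were a 16-bit code point, which is the workaround the quote describes rather than something the UTF-7 definition itself promises.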
Perl's Unicode handling is problematic for many reasons. The core modules are not all Unicode compliant, much less all of the general run-of-the-mill modules to be found on CPAN. While it is possible to write one's own Unicode-compliant code in Perl, perhaps by adapting others' modules or writing one's own, that is not the same as saying that Perl is already Unicode compatible. As the dictionary definition indicates, being "compatible" means not requiring special adaptation or modification--something which cannot yet be said of Perl, considering the gymnastics the average coder goes through to learn the ropes of enabling Unicode in his or her code. When the module "Encode" was removed from core, the gymnastic routines started over again to learn the new ways to handle Unicode.
Perl is certainly adaptable. But being able to be adapted is not the same as coming with those adaptations already built in and ready to use. One cannot just say: print "$unicode_string\n"; and expect a beautiful output as if the text were not Unicode.
The problem is that Perl seems to have its own standard for the language it works from, and it translates all input/output based on that standard. If it were possible to say something like "use unicode;" as a pragma at the beginning of one's script, which would then induce Perl to treat ALL program input, output, and internals as being in the one common language of Unicode, then I would say it not only "supports" Unicode, but is fully "compatible" with it. Unfortunately, this is still a dream, not a reality.
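To give an idea of the adaptation involved, here is a minimal sketch of the boilerplate commonly placed at the top of a script so that a plain print behaves. It assumes the source file is saved as UTF-8 and that the terminal expects UTF-8; the sample string is made up. It is not the single "use unicode;" pragma wished for above: it covers string literals and the standard filehandles, but not @ARGV, %ENV, database handles, or data coming back from modules.

```perl
use strict;
use warnings;
use utf8;                             # string literals in this file are UTF-8
use open qw(:std :encoding(UTF-8));   # STDIN/STDOUT/STDERR and default open() layers

# Illustrative sample text; with "use utf8" this is a string of 9 characters.
my $unicode_string = "Grüße, 世界";

print "$unicode_string\n";            # encoded to UTF-8 by the STDOUT layer
print "characters: ", length($unicode_string), "\n";
```

Without the two extra "use" lines the same print may emit a "Wide character" warning or mangled bytes, which is the kind of special handling meant here.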
Yes, you can program for Unicode with Perl--it is supported. But it is not easy, as any of us with extensive experience can tell you--and it requires that one's code be specially adapted to handle Unicode, thus failing the dictionary definition of "compatible." If you choose to believe that "compatible" and "supported" are equivalent terms, fully synonymous, then so be it. We may need to agree to disagree, as I do not see them as identical words, and my usage here follows my understanding of their separate meanings.
Blessings,
~Polyglot~
In reply to Re^7: Converting Unicode by Polyglot, in thread Converting Unicode by BernieC