To me, there are two important differences between UCS-2 and UTF-16.

The first important difference is that UCS-2 can only represent U+0000 to U+FFFF (the Basic Multilingual Plane), whereas UTF-16 can represent any Unicode character.

The second important difference is the number of bytes UCS-2 and UTF-16 use to store a character. Each UCS-2 character is exactly 16 bits in size, whereas UTF-16, like UTF-8, is a variable-width encoding: characters above U+FFFF require two 16-bit words (a surrogate pair).
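To make the size difference concrete, here is a minimal sketch using the core Encode module. The helper name hex_bytes and the two sample characters are my own choices for illustration. The UCS-2 attempt is wrapped in eval because I expect Encode to refuse the character when FB_CROAK is passed, but the script just reports whatever happens rather than assuming it.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hex dump helper (hypothetical name, for illustration only).
sub hex_bytes { join ' ', map { sprintf '%02X', ord($_) } split //, $_[0] }

my $bmp    = "\x{00E9}";   # U+00E9, inside U+0000..U+FFFF
my $astral = "\x{1F600}";  # U+1F600, above U+FFFF

# UTF-16BE is used here so that no BOM is mixed into the byte counts.
my $bmp_utf16    = encode('UTF-16BE', $bmp);
my $astral_utf16 = encode('UTF-16BE', $astral);

printf "U+00E9  as UTF-16BE: %s\n", hex_bytes($bmp_utf16);     # one 16-bit word
printf "U+1F600 as UTF-16BE: %s\n", hex_bytes($astral_utf16);  # two 16-bit words

# UCS-2 has no surrogate pairs, so U+1F600 does not fit in it.
# encode() with a CHECK argument may modify its input, so pass a copy.
my $copy = $astral;
my $ucs2 = eval { encode('UCS-2BE', $copy, Encode::FB_CROAK()) };
print defined $ucs2
    ? "UCS-2BE produced: " . hex_bytes($ucs2) . "\n"
    : "UCS-2BE refused U+1F600: $@";
```

The second line of output should show the surrogate pair D83D DE00 as four bytes.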

"for output, byte order is determined by the cpu"

No. I'm on an x86 (a little-endian machine), but UTF-16BE was used.
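This is easy to check. A minimal sketch, assuming the plain "UTF-16" encoding name from the Encode module is what is in use (the variable names are mine):

```perl
use strict;
use warnings;
use Encode qw(encode);

# What the CPU does: pack a native 16-bit integer and look at the first byte.
my $native = unpack('C', pack('S', 1)) ? 'little-endian' : 'big-endian';

# What Encode does when asked for plain "UTF-16".
my $bytes = encode('UTF-16', 'A');
my $hex   = uc unpack('H*', $bytes);

print "native byte order: $native\n";
print "encode('UTF-16', 'A'): $hex\n";
```

On my little-endian box the second line is still FEFF0041: a big-endian BOM followed by big-endian data.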

"for input, byte order is determined by a stream-initial BOM (if the BOM isn't there, perl complains about it; if it is there, perl does not remove it for you)."

No. Perl *does* remove it for you, just as it adds one for you on output.
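Again, a minimal sketch to check this, assuming decoding goes through Encode's plain "UTF-16" name (variable names are mine). The last case is only my guess at where the confusion might come from: with an explicit UTF-16BE or UTF-16LE, the BOM is treated as an ordinary character and stays in the string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Plain "UTF-16": the BOM selects the byte order and is then dropped.
my $from_be = decode('UTF-16', "\xFE\xFF\x00\x41");  # big-endian BOM + "A"
my $from_le = decode('UTF-16', "\xFF\xFE\x41\x00");  # little-endian BOM + "A"
printf "BE input: %d char(s)\n", length $from_be;
printf "LE input: %d char(s)\n", length $from_le;

# No BOM at all: decode() dies, so trap it.
my $no_bom = eval { decode('UTF-16', "\x00\x41") };
print defined $no_bom ? "no BOM: decoded anyway\n" : "no BOM: $@";

# Explicit byte order: the BOM is kept as U+FEFF.
my $explicit = decode('UTF-16BE', "\xFE\xFF\x00\x41");
printf "UTF-16BE input: %d char(s), first is U+%04X\n",
    length $explicit, ord $explicit;
```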