iso-8859-1 can only represent a small subset of the characters supported by UTF-16. The conversion to iso-8859-1 may result in information loss.

Just for future reference in case I run across this situation, is there a charset that can be used to circumvent this issue? Or am I stuck with the potential loss? Thanks.
Re^7: Unexpected output from my PERL program. WHAT is my problem???
by ikegami (Patriarch) on Nov 04, 2009 at 18:02 UTC

    is there a charset that can be used to circumvent this issue?

    You mean "character encoding", not "character set". The question is:

    Is there a character encoding that can represent the Unicode character set?

    All UTF-* encodings can handle all Unicode characters.
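
    As a minimal sketch of that difference, assuming the core Encode module and a made-up sample string ($text is purely illustrative, not taken from the original program), the following round-trips a string through two UTF-* encodings and then shows what iso-8859-1 does with the same characters:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sample: a decoded Perl string holding two characters that
# iso-8859-1 cannot represent, U+20AC (euro sign) and U+4E2D (a CJK ideograph).
my $text = "price: \x{20AC}5 \x{4E2D}";

# Both UTF-* encodings can represent every character, so the round trip is lossless.
for my $enc (qw(UTF-8 UTF-16BE)) {
    my $bytes = encode($enc, $text);
    my $back  = decode($enc, $bytes);
    print "$enc round trip: ", ($back eq $text ? "ok" : "LOSSY"), "\n";
}

# With the default CHECK value, Encode silently substitutes "?" for each
# character that iso-8859-1 cannot represent.
my $latin1 = encode('iso-8859-1', $text);
print "iso-8859-1 bytes: $latin1\n";

# Ask for an exception instead of silent substitution.
# LEAVE_SRC keeps encode() from modifying $text in place.
my $ok = eval {
    encode('iso-8859-1', $text, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
};
print "strict encode failed: $@" unless $ok;
```

    Passing FB_CROAK is one way to find out about the loss when it happens rather than discovering stray question marks later.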

    There's obviously something missing from the question since you started off with such a character encoding (UTF-16be).

    Also worth mentioning are the UCS-2* encodings. UCS-2le and UCS-2be are the fixed-width subsets of UTF-16le and UTF-16be. They can handle a big chunk of Unicode (U+0000..U+FFFF).
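
    The fixed-width limit can be seen by counting bytes. This is a rough sketch using the core Encode module; the two sample characters are arbitrary picks, one inside U+0000..U+FFFF and one above it. The last step assumes Encode's UCS-2 encoder rejects the higher code point when FB_CROAK is requested, which is my understanding of Encode::Unicode, so the code prints whichever outcome it actually gets:

```perl
use strict;
use warnings;
use Encode qw(encode);

my $bmp    = "\x{E9}";      # U+00E9, inside U+0000..U+FFFF
my $astral = "\x{1F600}";   # U+1F600, above U+FFFF

# Inside the range, UTF-16le uses a single 16-bit unit (2 bytes).
my $bmp_len = length(encode('UTF-16LE', $bmp));

# Above the range, UTF-16le needs a surrogate pair (two 16-bit units, 4 bytes).
my $astral_len = length(encode('UTF-16LE', $astral));

printf "U+00E9  takes %d bytes in UTF-16LE\n", $bmp_len;
printf "U+1F600 takes %d bytes in UTF-16LE\n", $astral_len;

# UCS-2le has no surrogate pairs, so it has no way to represent U+1F600.
my $ok = eval {
    encode('UCS-2LE', $astral, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
};
print $ok ? "UCS-2LE encoded U+1F600\n" : "UCS-2LE rejected U+1F600: $@";
```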

    Windows uses UCS-2le internally and uses this for its Wide interface. UTF-8 is the encoding of choice elsewhere.

    In fact, unix terminals tend to expect UTF-8 these days. It kinda surprised me when you asked for iso-8859-1.
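
    For a terminal that does expect UTF-8, a minimal sketch is to put an encoding layer on STDOUT and print decoded strings as they are. The sample text is made up, and the terminal's encoding is an assumption you would need to check for your own setup:

```perl
use strict;
use warnings;

# Encode everything printed to STDOUT as UTF-8.
# Assumption: the terminal reading this output expects UTF-8.
binmode(STDOUT, ':encoding(UTF-8)');

# A decoded string with characters inside and outside iso-8859-1.
print "caf\x{E9} \x{20AC}5 \x{4E2D}\n";
```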