in reply to Re^3: Unexpected output from my PERL program. WHAT is my problem???
in thread Unexpected output from my PERL program. WHAT is my problem???

ikegami: Thanks for all your help. Those TWO LINES were all that I needed:

binmode(STDOUT, ':encoding(iso-8859-1)');
print $response->decoded_content(default_charset => 'UTF-16be');


I don't know how I missed that the first time you posted it.

Re^5: Unexpected output from my PERL program. WHAT is my problem???
by ikegami (Patriarch) on Nov 04, 2009 at 17:37 UTC

    Two things I meant to say:

    Since the perl program now returns text according to your C program's definition, you won't have problems with popen.

    iso-8859-1 can only represent a small subset of the characters supported by UTF-16. The conversion to iso-8859-1 may result in information loss.
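
To make the information-loss point concrete, here is a minimal sketch using the core Encode module. The sample bytes and variable names are made up for illustration and are not taken from the original program. It decodes a short UTF-16be string and then encodes it to iso-8859-1 twice: once with the default behaviour, and once asking Encode to die on anything it cannot represent.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample input: "A" followed by U+20AC EURO SIGN, as UTF-16be bytes.
my $utf16be_bytes = "\x00\x41\x20\xAC";
my $text = decode('UTF-16be', $utf16be_bytes);

# With no CHECK argument, encode() substitutes a placeholder for characters
# that iso-8859-1 cannot represent, so the euro sign is silently lost.
my $latin1 = encode('iso-8859-1', $text);

# Asking Encode to croak instead lets the caller detect the loss.
# LEAVE_SRC keeps encode() from modifying $text.
my $lossless = eval {
    encode('iso-8859-1', $text, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
};

printf "%d character(s) in, %d byte(s) out, lossless: %s\n",
    length($text), length($latin1), ($lossless ? 'yes' : 'no');
```

Whether silent substitution or a hard failure is preferable depends on what the consumer of the output (here, the C program) can tolerate.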

    I don't know how I missed that the first time you posted it.

    This is the first time I posted this. I needed to know what kind of data you had before I could post a solution. The solution is not a general solution. It would be inappropriate for XML, for example.

      iso-8859-1 can only represent a small subset of the characters supported by UTF-16. The conversion to iso-8859-1 may result in information loss.

      Just for future reference in case I run across this situation, is there a charset that can be used to circumvent this issue? Or am I stuck with the potential loss? Thanks.

        is there a charset that can be used to circumvent this issue?

        You mean "character encoding", not "character set". The question is:

        Is there a character encoding that can represent the Unicode character set?

        All UTF-* encodings can handle all Unicode characters.

        There's obviously something missing from the question, since you started off with such a character encoding (UTF-16be).

        Also worth mentioning are the UCS-2* encodings. UCS-2le and UCS-2be are the fixed-width subsets of UTF-16le and UTF-16be. They can handle a big chunk of Unicode (U+0000..U+FFFF).
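
As a rough illustration of the difference between the UTF-* and UCS-2* encodings, the sketch below uses the core Encode module with a made-up two-character sample (one character inside U+0000..U+FFFF, one outside). The encoding names are the ones Encode ships with; everything else is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: a BMP character (U+20AC) and a non-BMP one (U+1F600).
my $text = "\x{20AC}\x{1F600}";

# Every UTF-* encoding round-trips the full string.
my %round_trip_ok;
for my $enc (qw(UTF-8 UTF-16BE UTF-16LE UTF-32BE UTF-32LE)) {
    $round_trip_ok{$enc} = decode($enc, encode($enc, $text)) eq $text ? 1 : 0;
}

# For a BMP character, UCS-2BE and UTF-16BE produce the same two bytes.
my $bmp_same = encode('UCS-2BE', "\x{20AC}") eq encode('UTF-16BE', "\x{20AC}");

# Outside the BMP, UTF-16BE needs a surrogate pair (four bytes),
# which fixed-width UCS-2 has no way to express.
my $non_bmp_len = length encode('UTF-16BE', "\x{1F600}");

print "$_: ", ($round_trip_ok{$_} ? 'ok' : 'lossy'), "\n" for sort keys %round_trip_ok;
print "UCS-2BE matches UTF-16BE for U+20AC: ", ($bmp_same ? 'yes' : 'no'), "\n";
print "UTF-16BE bytes for U+1F600: $non_bmp_len\n";
```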

        Windows uses UCS-2le internally and uses this for its Wide interface. UTF-8 is the encoding of choice elsewhere.

        In fact, unix terminals tend to expect UTF-8 these days. It kinda surprised me when you asked for iso-8859-1.
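
For a terminal that expects UTF-8, the output layer would name UTF-8 instead of iso-8859-1. The following is only a sketch under that assumption: it writes to an in-memory file rather than STDOUT so the resulting bytes can be inspected, and the sample string is made up.

```perl
use strict;
use warnings;

# Write through an :encoding(UTF-8) layer to an in-memory file so the
# resulting bytes can be inspected; on a real terminal the equivalent
# would be binmode(STDOUT, ':encoding(UTF-8)').
my $bytes = '';
open(my $out, '>:encoding(UTF-8)', \$bytes) or die "open failed: $!";
print {$out} "caf\x{E9} \x{20AC}\n";   # e-acute and euro sign
close($out) or die "close failed: $!";

# Show the encoded bytes in hex.
printf "%d bytes: %s\n", length($bytes),
    join ' ', map { sprintf '%02X', ord } split //, $bytes;
```

Unlike the iso-8859-1 layer, this one has a byte sequence for every Unicode character, so nothing has to be substituted on output.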
