in reply to Re^4: Unexpected output from my PERL program. WHAT is my problem???
in thread Unexpected output from my PERL program. WHAT is my problem???

Two things I meant to say:

Since the Perl program now returns text according to your C program's definition, you won't have problems with popen.

iso-8859-1 can only represent a small subset of the characters supported by UTF-16. The conversion to iso-8859-1 may result in information loss.
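
A minimal sketch of that loss, assuming the core Encode module and a made-up two-character sample (neither comes from the original program). By default, a character that iso-8859-1 cannot represent is replaced with a substitution character; passing Encode::FB_CROAK makes the conversion die instead.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical sample: U+00E9 exists in iso-8859-1, U+20AC (euro sign) does not.
my $utf16be = "\x00\xE9\x20\xAC";
my $text    = decode('UTF-16BE', $utf16be);

# Default CHECK (0): unrepresentable characters become a substitution character.
my $lossy = encode('iso-8859-1', $text);

# FB_CROAK: die instead of silently losing data.
# Encode may modify its input when CHECK is set, so work on a copy.
my $copy   = $text;
my $strict = eval { encode('iso-8859-1', $copy, Encode::FB_CROAK) };
my $failed = defined($strict) ? 0 : 1;
```

Checking $failed (or the message in $@) is one way to detect the loss before the data reaches the C program.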

I don't know how I missed that the first time you posted it.

This is the first time I posted this. I needed to know what kind of data you had before I could post a solution. The solution is not a general solution. It would be inappropriate for XML, for example.


Re^6: Unexpected output from my PERL program. WHAT is my problem???
by URAvgDeveloper101 (Novice) on Nov 04, 2009 at 17:44 UTC
    iso-8859-1 can only represent a small subset of the characters supported by UTF-16. The conversion to iso-8859-1 may result in information loss.

    Just for future reference in case I run across this situation, is there a charset that can be used to circumvent this issue? Or am I stuck with the potential loss? Thanks.

      is there a charset that can be used to circumvent this issue?

      You mean "character encoding", not "character set". The question is:

      Is there a character encoding that can represent the Unicode character set?

      All UTF-* encodings can handle all Unicode characters.
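
      A quick way to check this, sketched with the core Encode module. The sample string and the list of encodings are my own picks for illustration: one character inside the BMP and one outside it, round-tripped through several UTF-* encodings.

```perl
use strict;
use warnings;
use Encode qw( decode encode );

# Hypothetical sample: U+20AC is in the BMP, U+1F600 is outside it.
my $text = "\x{20AC}\x{1F600}";

# Encode to bytes and decode back; record whether the text survived.
my %round_trip_ok;
for my $enc (qw( UTF-8 UTF-16LE UTF-16BE UTF-32LE UTF-32BE )) {
    my $bytes = encode($enc, $text);
    $round_trip_ok{$enc} = decode($enc, $bytes) eq $text ? 1 : 0;
}
```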

      There's obviously something missing from the question, since you started off with such a character encoding (UTF-16be).

      Also worth mentioning are the UCS-2* encodings. UCS-2le and UCS-2be are the fixed-width subsets of UTF-16le and UTF-16be. They can handle a big chunk of Unicode (U+0000..U+FFFF).
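
      A rough sketch of the difference, assuming Encode's UCS-2BE and UTF-16BE encodings and two sample characters chosen for illustration. Inside U+0000..U+FFFF the two produce the same bytes; above that range UTF-16 switches to a four-byte surrogate pair, which UCS-2 has no way to express.

```perl
use strict;
use warnings;
use Encode qw( encode );

# Hypothetical samples: one character inside the BMP, one outside it.
my $bmp    = "\x{20AC}";
my $astral = "\x{1F600}";

# Inside the BMP both encodings yield the same two bytes.
my $same_in_bmp = encode('UCS-2BE', $bmp) eq encode('UTF-16BE', $bmp) ? 1 : 0;

# Outside the BMP, UTF-16 uses a surrogate pair (four bytes).
my $utf16 = encode('UTF-16BE', $astral);

# UCS-2 cannot represent it; with FB_CROAK the attempt should die.
# Encode may modify its input when CHECK is set, so work on a copy.
my $copy          = $astral;
my $ucs2          = eval { encode('UCS-2BE', $copy, Encode::FB_CROAK) };
my $ucs2_rejected = defined($ucs2) ? 0 : 1;
```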

      Windows uses UTF-16le internally (UCS-2le before Windows 2000) and uses this for its Wide interface. UTF-8 is the encoding of choice elsewhere.

      In fact, unix terminals tend to expect UTF-8 these days. It kinda surprised me when you asked for iso-8859-1.
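
      If the output were headed for such a terminal, a sketch along these lines would do, assuming the core Encode module and PerlIO's :encoding layer. The input bytes are invented, and an in-memory handle stands in for STDOUT so the resulting bytes can be inspected.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical input: U+00E9 and U+20AC as UTF-16BE bytes.
my $text = decode('UTF-16BE', "\x00\xE9\x20\xAC");

# In-memory handle in place of STDOUT; for a real terminal one would use
#   binmode(STDOUT, ':encoding(UTF-8)');
my $buf = '';
open(my $out, '>:encoding(UTF-8)', \$buf)
    or die "Can't open in-memory handle: $!";
print {$out} $text;
close($out);
```

      Nothing is lost here, because UTF-8 can represent every character that UTF-16be can.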