in reply to UTF-8 to ISO-8859-1 conversion of euro symbol

There is no euro symbol in iso-8859-1 - you want Latin 9 (ISO-8859-9) instead. See http://www.cs.tut.fi/~jkorpela/latin9.html for the differences.

/J\

Re^2: UTF-8 to ISO-8859-1 conversion of euro symbol
by mirod (Canon) on Mar 03, 2005 at 14:12 UTC

    Actually if you read more carefully the article you'll see that Latin 9 is referenced as ISO 8859-15, not 9: The ISO Latin 9 (ISO 8859-15)...

    Update: added text in italics; the original was a bit rude, sorry.

      Yes, absolutely correct. The mistake arose between brain and keyboard: I originally typed 8859-1 in both places, then changed the second one but, er, incorrectly. I blame the finance consultant trying to talk someone through configuring their VPN loudly on the other side of the office ;-)

      /J\

Re^2: UTF-8 to ISO-8859-1 conversion of euro symbol
by Anonymous Monk on Mar 03, 2005 at 14:30 UTC
    Taking both comments into consideration, I have now switched my encoding to ISO-8859-15. I'm now seeing utf-8 character 189 (ISO-8859 character 164: ¤) with my original code (s/8859-1/8859-15/) and template...

    These encodings seem to propagate like rabbits. I'm glad someone knows the difference between them.

Re^2: UTF-8 to ISO-8859-1 conversion of euro symbol
by Anonymous Monk on Mar 03, 2005 at 14:21 UTC
    Thank you for the clarification. I had just ISO-8859 written on my notes and assumed I meant -1. I have changed my encoding to ISO-8859-9.

    Unfortunately, this has the same result using the utf-8 typed euro character: I am given whitespace by from_utf8(). A stand-alone test case also shows that as the response from the function.

      As mirod pointed out, I was typing crap: it is (confusingly) ISO-8859-15 you should be using. Are you sure that whatever you are using to look at the output is actually using the correct character set to display the euro character from that encoding? There is also the Windows code page cp1252, which includes the character but with a different numeric code.
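
To make the "different numeric code" point concrete, here is a minimal sketch in Python (not the Perl code from the original question, which isn't shown in this thread). It encodes the euro sign in each charset mentioned so far and prints the resulting bytes. The helper name `euro_bytes` is made up for illustration.

```python
# Sketch only: encode the euro sign (U+20AC, decimal 8364) in the charsets
# discussed in this thread and show the bytes each one produces.
EURO = "\u20ac"

def euro_bytes(charset):
    """Return the encoded bytes for the euro sign, or None if the charset has none."""
    try:
        return EURO.encode(charset)
    except UnicodeEncodeError:
        # The charset has no code for the euro sign at all.
        return None

for charset in ("iso-8859-1", "iso-8859-9", "iso-8859-15", "cp1252", "utf-8"):
    encoded = euro_bytes(charset)
    print(charset, "no euro sign" if encoded is None else encoded.hex())

# iso-8859-1   no euro sign
# iso-8859-9   no euro sign
# iso-8859-15  a4      (decimal 164)
# cp1252       80      (decimal 128)
# utf-8        e282ac
```

So a converter targeting ISO-8859-1 or ISO-8859-9 has nothing to map the euro sign to, and what it does instead (error, substitute character, blank) depends on the library.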

      /J\

        I am not 100% certain that the encoding at the other end is an ISO-8859 variant. It does correctly display other characters encoded in ISO-8859, however, such as ø.

        So, if I have code point 8364 (U+20AC, the euro sign) in utf-8, that's transformed to 164 in ISO-8859-15. 0164 in Windows CP1252 is what I'm seeing...so I will see if switching the encoding helps.

        Thanks for all the help!

        Working on the theory that I might need cp1252 encoding, I modified my original code to convert to charset cp1252. The result is whitespace.
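
One thing that may be worth ruling out here is the display side rather than the conversion. The sketch below (Python again, with a made-up helper `describe`) decodes the two candidate euro bytes under each charset the viewer might be assuming. It does not reproduce `from_utf8()`, and the thread doesn't confirm that this is the cause of the whitespace; it only shows what a mismatched viewer would do with correct output.

```python
# Sketch only: how the same byte looks when the viewing side assumes a
# different charset than the one used for encoding.
def describe(raw, charset):
    """Decode raw bytes with charset; return (text, whether it is printable)."""
    text = raw.decode(charset)
    return text, text.isprintable()

for raw in (b"\xa4", b"\x80"):
    for charset in ("iso-8859-1", "iso-8859-15", "cp1252"):
        text, printable = describe(raw, charset)
        print(raw.hex(), charset, repr(text), "printable" if printable else "not printable")
```

Byte 0xA4 (the ISO-8859-15 euro) decodes to the currency sign ¤ under ISO-8859-1 and cp1252, which matches the ¤ reported earlier in the thread. Byte 0x80 (the cp1252 euro) decodes to an invisible C1 control character under either ISO-8859 charset, so a viewer that assumes ISO-8859 could plausibly show nothing, or a blank, for a correctly converted cp1252 euro.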