in reply to Re^7: Another utf-8 decoding problem
in thread Another utf-8 decoding problem

Hmm. I can't seem to get it right whatever I use. I used Devel::Peek to check the UTF8 flag, and it seems to be set. Here are some different outputs:


SV = PV(0xb8de060) at 0xbbabfa0
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK)
  PV = 0xbb6f730 "traningsredskap"\0
  CUR = 15
  LEN = 16
SV = PV(0xb8dddd0) at 0xa3b0800
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0xbb8bc98 "traningsredskap"\0 [UTF8 "traningsredskap"]
  CUR = 15
  LEN = 16
SV = PV(0xb8dddd0) at 0xa3b0800
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0xbb6d3d0 "traningsredskap"\0 [UTF8 "traningsredskap"]
  CUR = 15
  LEN = 16
SV = PV(0xb8de060) at 0xbbabfa0
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK)
  PV = 0xbb72668 "traningsredskap"\0
  CUR = 15
  LEN = 16
using

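use Devel::Peek;                  # provides Dump
use Encode qw(decode encode);     # provides decode and encode

my $str = $original_value;
Dump $str;                        # raw value, before decoding
$str = decode("utf-8", $str);
Dump $str;                        # decoded character string
Dump encode('latin1', $str);      # re-encoded as Latin-1 bytes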

Re^9: Another utf-8 decoding problem
by DreamT (Pilgrim) on Oct 11, 2010 at 14:19 UTC
    Sorry, meant

    SV = PV(0x9cf0060) at 0x9fbde50
      REFCNT = 1
      FLAGS = (TEMP,POK,pPOK)
      PV = 0x9fa6988 "Tr?ningsredskap"\0
      CUR = 15
      LEN = 16
    SV = PV(0x9cefdd0) at 0x87c2800
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK,UTF8)
      PV = 0x9f9b490 "Tr\303\244ningsredskap"\0 [UTF8 "Tr\x{e4}ningsredskap"]
      CUR = 16
      LEN = 20
    SV = PV(0x9cefdd0) at 0x87c2800
      REFCNT = 1
      FLAGS = (PADMY,POK,pPOK,UTF8)
      PV = 0x9f7ca38 "Tr\357\277\275ningsredskap"\0 [UTF8 "Tr\x{fffd}ningsredskap"]
      CUR = 17
      LEN = 20
    SV = PV(0x9cf0060) at 0x9fbde50
      REFCNT = 1
      FLAGS = (TEMP,POK,pPOK)
      PV = 0x9f6b8c8 "Tr?ningsredskap"\0
      CUR = 15
      LEN = 16
      SV = PV(0x9cf0060) at 0x9fbde50 REFCNT = 1 FLAGS = (TEMP,POK,pPOK) PV = 0x9fa6988 "Tr?ningsredskap"\0 CUR = 15 LEN = 16

      This looks like Latin-1: a byte string (no UTF8 flag) with one byte per character (CUR = 15).

      SV = PV(0x9cefdd0) at 0x87c2800 REFCNT = 1 FLAGS = (PADMY,POK,pPOK,UTF8) PV = 0x9f9b490 "Tr\303\244ningsredskap"\0 [UTF8 "Tr\x{e4}ningsredskap"] CUR = 16 LEN = 20

      A proper string in Perl's internal format. Should be fine to print out if you add that IO layer, or put it through Encode::encode.
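As a rough sketch of the two output options mentioned above, the following builds the same character string and either encodes it explicitly or lets an IO layer do so. The variable names are made up for illustration, and the choice of UTF-8 for STDOUT is an assumption; the target encoding should be whatever the receiving side expects.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# A decoded character string, built here from its UTF-8 bytes
# (a-umlaut is 0xC3 0xA4 in UTF-8).
my $chars = decode('UTF-8', "Tr\xC3\xA4ningsredskap");

# Option 1: encode explicitly; each result is a plain byte string.
my $utf8_bytes   = encode('UTF-8', $chars);       # a-umlaut stays 0xC3 0xA4
my $latin1_bytes = encode('iso-8859-1', $chars);  # a-umlaut becomes 0xE4

printf "UTF-8:   %d bytes\n", length $utf8_bytes;
printf "Latin-1: %d bytes\n", length $latin1_bytes;

# Option 2: let an IO layer encode on output (UTF-8 assumed here).
binmode(STDOUT, ':encoding(UTF-8)');
print $chars, "\n";
```

If the output has to be iso-8859-1, the same pattern should apply with that encoding name in either the encode call or the IO layer.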

      SV = PV(0x9cefdd0) at 0x87c2800 REFCNT = 1 FLAGS = (PADMY,POK,pPOK,UTF8) PV = 0x9f7ca38 "Tr\357\277\275ningsredskap"\0 [UTF8 "Tr\x{fffd}ningsredskap"] CUR = 17 LEN = 20

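      This is wrong. The U+FFFD replacement character means the input bytes were not valid in the encoding you asked for, i.e. you decoded something using the wrong character encoding.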
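A small sketch that seems to reproduce this situation, assuming the original value really was Latin-1 as guessed above: decoding Latin-1 bytes as UTF-8 substitutes U+FFFD for the invalid byte, while decoding them as iso-8859-1 gives the expected U+00E4. The variable names are hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The word with a-umlaut as Latin-1 bytes: the umlaut is the single byte 0xE4.
my $latin1_bytes = "Tr\xE4ningsredskap";

# Wrong encoding: 0xE4 followed by 'n' is not a valid UTF-8 sequence, so
# Encode (with its default CHECK behaviour) substitutes U+FFFD.
my $wrong = decode('UTF-8', $latin1_bytes);

# Matching encoding: byte 0xE4 maps to the character U+00E4.
my $right = decode('iso-8859-1', $latin1_bytes);

printf "decoded as UTF-8:      U+%04X\n", ord(substr($wrong, 2, 1));
printf "decoded as iso-8859-1: U+%04X\n", ord(substr($right, 2, 1));
```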

        "A proper string in Perl's internal format. Should be fine to print out if you add that IO layer, or put it through Encode::encode. " - Even if the browser environment is iso-8859-1? :-)