
Where?

use Devel::Peek;
use JSON::XS;

my $smile = "☻";
my $j = JSON::XS->new->latin1(1)->encode(["$smile"]);
my $d = JSON::XS->new->utf8(1)->latin1(1)->decode($j);
Dump $$d[0];

Output:
SV = PV(0x8f19788) at 0x8f323bc
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x8f4be68 "\342\230\273"\0 [UTF8 "\x{263b}"]
  CUR = 3
  LEN = 10

I put data without the UTF8 flag into JSON, and I want to get data without the UTF8 flag back out of that JSON.

How can I do that?

Re^4: JSON and utf8 flag
by choroba (Cardinal) on Sep 06, 2017 at 12:26 UTC
    > I put data without UTF8 flag to JSON

    RFC7159:

    JSON text SHALL be encoded in UTF-8, UTF-16, or UTF-32.


      I know, but... Quoting the JSON::XS documentation:

      ENCODING/CODESET FLAG NOTES
      The main use for latin1 is to relatively efficiently store binary data as JSON, at the expense of breaking compatibility with most JSON decoders.

      I need to store data as text. I do not want to use Storable, MessagePack or Sereal.

Re^4: JSON and utf8 flag
by ikegami (Patriarch) on Sep 07, 2017 at 18:04 UTC

    First of all, the UTF8 flag is irrelevant. I believe you are actually complaining that $smile ne $d->[0].
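
A minimal sketch of why the flag does not matter for comparisons, using only the core utf8:: functions. The variable names are illustrative and not from the thread. utf8::upgrade changes how Perl stores the string internally, but the string value (its sequence of characters) stays the same:

```perl
use strict;
use warnings;

# Three characters, each below 0x100; no "use utf8", so no UTF8 flag.
my $plain = "\xE2\x98\xBB";

# Copy the string and switch the copy to the internal UTF-8 storage format.
my $upgraded = $plain;
utf8::upgrade($upgraded);

my $flag_plain    = utf8::is_utf8($plain)    ? 1 : 0;
my $flag_upgraded = utf8::is_utf8($upgraded) ? 1 : 0;
my $same          = $plain eq $upgraded      ? 1 : 0;

print "flags: $flag_plain vs $flag_upgraded\n";   # the flags differ
print "same string: $same\n";                     # eq still says equal
print "length: ", length($upgraded), "\n";        # still 3 characters
```

Only the storage differs between the two scalars, which is why Devel::Peek shows different bytes for strings that Perl treats as equal.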


    Secondly, you claim your source code includes the following:

    my $smile = "☻";
    

    That's not possible unless you have use utf8; in effect. You actually have the following:

    my $s_orig = "\xE2\x98\xBB";

    If interpreted as Unicode Code Points (as ->encode does), you have LATIN SMALL LETTER A WITH CIRCUMFLEX, followed by START OF STRING, followed by RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK.
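
To see the two readings side by side, one can list the code points of the string before and after utf8::decode. This is only a sketch: code_points is a hypothetical helper introduced for illustration, and the variable names are not from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: format each character of a string as U+XXXX.
sub code_points {
    my ($str) = @_;
    return join(' ', map { sprintf('U+%04X', ord($_)) } split(//, $str));
}

# What the source literal contains without "use utf8": three characters.
my $as_bytes = "\xE2\x98\xBB";

# The same data after decoding it in place as UTF-8: one character.
my $as_text = $as_bytes;
utf8::decode($as_text);

print code_points($as_bytes), "\n";   # U+00E2 U+0098 U+00BB
print code_points($as_text),  "\n";   # U+263B
```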


    Finally, how to round trip:

    use Test::More tests => 1;
    use JSON::XS qw( decode_json encode_json );

    my $s_orig = "\xE2\x98\xBB";
    my $data_orig = [ $s_orig ];
    my $json_utf8 = encode_json($data_orig);
    my $data_got = decode_json($json_utf8);
    my $s_got = $data_got->[0];
    is($s_got, $s_orig);

    If you actually want to store a smile,

    use Test::More tests => 1;
    use JSON::XS qw( decode_json encode_json );

    my $smile_utf8 = "\xE2\x98\xBB";
    # Copy the UTF-8 bytes into $s_orig, then decode the copy in place.
    utf8::decode( my $s_orig = $smile_utf8 );
    my $data_orig = [ $s_orig ];
    my $json_utf8 = encode_json($data_orig);
    my $data_got = decode_json($json_utf8);
    my $s_got = $data_got->[0];
    is($s_got, $s_orig);

      The eq operator is not an indicator. See Devel::Peek.
      > cat 1.pl
      use Test::More tests => 1;
      use JSON::XS qw( decode_json encode_json );

      my $s_orig = "\xE2\x98\xBB";
      my $data_orig = [ $s_orig ];
      my $json_utf8 = encode_json($data_orig);
      my $data_got = decode_json($json_utf8);
      my $s_got = $data_got->[0];
      is($s_got, $s_orig);

      use Devel::Peek;
      print "orig $s_orig\n";
      Dump $s_orig;
      print "\ngot $s_got\n";
      Dump $s_got;

      > perl 1.pl
      1..1
      ok 1
      orig ☻
      SV = PV(0x9b946c8) at 0x9badb40
        REFCNT = 1
        FLAGS = (PADMY,POK,IsCOW,pPOK)
        PV = 0x9bb2808 "\342\230\273"\0
        CUR = 3
        LEN = 10
        COW_REFCNT = 2

      got ☻
      SV = PV(0x9b94798) at 0x9ea91dc
        REFCNT = 1
        FLAGS = (PADMY,POK,IsCOW,pPOK,UTF8)
        PV = 0x9bb7c10 "\303\242\302\230\302\273"\0 [UTF8 "\x{e2}\x{98}\x{bb}"]
        CUR = 6
        LEN = 10
        COW_REFCNT = 1
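
If the internal storage format itself matters (for example, when the value is handed to code that inspects the flag), one possible approach is to call utf8::downgrade on the decoded value. This is a sketch and not something proposed in the thread. It assumes every character in the string is below 0x100, and it uses the core JSON::PP module as a stand-in for JSON::XS, on the assumption that encode_json and decode_json behave the same way for this input.

```perl
use strict;
use warnings;
use JSON::PP qw( decode_json encode_json );   # assumption: stand-in for JSON::XS

my $s_orig    = "\xE2\x98\xBB";
my $json_utf8 = encode_json([ $s_orig ]);
my $s_got     = decode_json($json_utf8)->[0];

# Convert the internal storage back to one byte per character.
# The second argument (FAIL_OK) makes utf8::downgrade return false
# instead of dying if the string holds a character above 0xFF.
my $downgraded = utf8::downgrade($s_got, 1);

print "downgraded: ", ($downgraded            ? "yes" : "no"), "\n";
print "flag set: ",   (utf8::is_utf8($s_got)  ? "yes" : "no"), "\n";
print "equal: ",      ($s_got eq $s_orig      ? "yes" : "no"), "\n";
```

The downgrade changes only the storage, so the string still compares equal to the original. For a string that contains a character above 0xFF, such as U+263B, the call returns false and leaves the string as it was.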