in reply to Re^3: Is JSON::XS doing more things right here (flag)
in thread JSON::XS produces valid utf-8, and JSON doesn't - why?

Ok, ok. Although from my perspective, there's little sense in getting (2), it might make sense in situations beyond me.

Apart from that, my use case (where the problem occurred) requires "round trip integrity", meaning: Given a string, I can...
  1. -> encode it as JSON
  2. -> write it to db
  3. -> read it back
  4. -> decode from JSON
... and end up with the same Perl structure as before, or, put differently, a structure that behaves like the original (I leave that small distinction open because I didn't check whether the utf-8 flag was set).
With my "round trip" relying on JSON::to_json(), from_json() blew up on the "decode from JSON" step.
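
For reference, the four steps above can be sketched as follows. This is not the original test script: it assumes the core JSON::PP module with the utf8 option enabled, and a plain hash (the hypothetical %fake_db) stands in for the database so that the example is self-contained.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical stand-in for the database: a hash that stores raw bytes.
my %fake_db;

# ->utf8 makes encode() return UTF-8 octets and decode() expect them.
# ->allow_nonref permits a bare string as the top-level JSON value.
my $json = JSON::PP->new->utf8->allow_nonref;

my $original = "Hello \x{2013} World";           # character string containing U+2013

$fake_db{greeting} = $json->encode($original);   # steps 1 and 2: encode, "write to db"
my $stored  = $fake_db{greeting};                # step 3: read it back
my $decoded = $json->decode($stored);            # step 4: decode from JSON

print $decoded eq $original ? "round trip ok\n" : "round trip FAILED\n";
```

The point of the sketch is that the same encoder settings are used on both ends, and that what is stored is octets rather than a character string. A real database layer may add its own encoding step, which this stand-in does not model.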

(Original test script updated.) Devel::Peek output:

OK
SV = PV(0x4178620) at 0x4042060
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x4170620 "Hello \342\200\223 World"\0 UTF8 "Hello \x{2013} World"
  CUR = 15
  LEN = 24
SV = PV(0x41785f0) at 0x4049b18
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x4170590 "\"Hello \342\200\223 World\""\0 [UTF8 ""Hello \x{2013} World""]
  CUR = 19
  LEN = 40

CRASH
SV = PV(0x4178580) at 0x40a2548
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x4170600 "Hello \342\200\223 World"\0 UTF8 "Hello \x{2013} World"
  CUR = 15
  LEN = 24
SV = PV(0x41783d0) at 0x40a28d8
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x4176460 "\"Hello \342\200\223 World\""\0 [UTF8 ""Hello \x{2013} World""]
  CUR = 19
  LEN = 40

Huh? I don't see it... but wait! \x{2013} doesn't look like utf-8. Should be \xE2\x80\x93, no?

This output was nonsense: utf8::decode() had already altered the strings, so they were identical by the time Dump() ran (as tye and Anon Monk pointed out). The test script has been updated (new output here).
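
On the \x{2013} versus \xE2\x80\x93 question: the two notations describe the same character at different levels, and the octal \342\200\223 in the Dump output is just E2 80 93 written in octal. A minimal sketch using only the built-in utf8:: functions (the variable names are illustrative):

```perl
use strict;
use warnings;

my $chars = "Hello \x{2013} World";   # character string: the en dash is one character
my $bytes = $chars;
utf8::encode($bytes);                 # in place: the en dash becomes three octets

printf "characters: %d, octets: %d\n", length $chars, length $bytes;

# Show the UTF-8 octets of U+2013 in hex.
my $dash = "\x{2013}";
utf8::encode($dash);
my $hex = join ' ', map { sprintf '%02X', ord } split //, $dash;
print "U+2013 as UTF-8: $hex\n";

# The octal escapes seen in the Dump output are the same three octets.
print "octal form matches\n" if $dash eq "\342\200\223";
```

The octet count of 15 agrees with the CUR = 15 shown for the plain string in the dumps above.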

Re^5: Is JSON::XS doing more things right here (before)
by tye (Sage) on Nov 12, 2015 at 23:10 UTC

    You should also Dump the JSON string before you pass it to utf8:: functions.

    - tye
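
A minimal sketch of that advice, assuming a hand-written octet string in place of the real JSON output (the variable names are illustrative, not taken from the test script). Dumping before and after utf8::decode() makes the difference visible that the dumps in the parent node had hidden:

```perl
use strict;
use warnings;
use Devel::Peek;

# Assumed input: a JSON string held as raw UTF-8 octets (no UTF8 flag).
my $json_bytes = qq{"Hello \xE2\x80\x93 World"};

Dump($json_bytes);        # before: FLAGS lacks UTF8, CUR counts octets (output on STDERR)

my $copy = $json_bytes;
utf8::decode($copy);      # in place: octets become characters, UTF8 flag is switched on
Dump($copy);              # after: FLAGS includes UTF8

printf "UTF8 flag before: %d, after: %d\n",
    utf8::is_utf8($json_bytes) ? 1 : 0,
    utf8::is_utf8($copy)       ? 1 : 0;
printf "length before: %d, after: %d\n", length $json_bytes, length $copy;
```

Working on a copy keeps the original string untouched, so both states can be inspected side by side.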