in reply to JSON::XS produces valid utf-8, and JSON doesn't - why?

Can you perhaps describe what the inconsistency is?

Re^2: Is JSON::XS doing more things right here,
by isync (Hermit) on Nov 12, 2015 at 19:23 UTC
    post updated: JSON::XS produces "valid" utf-8, and JSON does not.

      "does not" is one of the most vague problem descriptions; right up there with "doesn't work".

      A Perl module that produces JSON has two valid choices of what to output: 1) A UTF-8 string that has been tagged as being UTF-8 (from Perl's perspective). 2) A UTF-8 string encoded as a sequence of bytes from Perl's perspective [same as (1) just without the "is utf-8" flag set].

      Your code presumes only one of those possibilities. I know that JSON::XS defaults to returning (1) but can be asked to return (2). I haven't looked into what JSON::PP gives.
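To make the two output forms concrete, here is a minimal sketch. It uses JSON::PP only because it ships with Perl; the assumption is that the `utf8` switch behaves the same way in the object interfaces of JSON and JSON::XS. The variable names are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP ();

# One non-ASCII character: U+00E9 (e with acute accent).
my $data = { name => "caf\x{e9}" };

# Form (1): a string of characters (no ->utf8 on the encoder).
my $chars = JSON::PP->new->canonical->encode($data);

# Form (2): the same JSON text as UTF-8 encoded bytes (->utf8 enabled).
my $bytes = JSON::PP->new->utf8->canonical->encode($data);

printf "character form: %d characters\n", length $chars;
printf "byte form:      %d bytes\n",      length $bytes;

# Decoding the bytes as UTF-8 gives back the character form.
my $decoded = $bytes;
utf8::decode($decoded);
print "same text after utf8::decode\n" if $decoded eq $chars;
```

The byte form is one unit longer because U+00E9 takes two bytes in UTF-8. Code that expects one form and receives the other will either double-encode the text or fail to decode it.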

      When dealing with UTF-8 problems in Perl, it is best to use Devel::Peek.
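A minimal sketch of that kind of inspection, assuming JSON::PP as a stand-in for whichever JSON module is in use. `Dump` writes its report to STDERR.

```perl
use strict;
use warnings;
use Devel::Peek qw(Dump);
use JSON::PP ();

# U+263A lies above 0xFF, so it cannot be stored as a single byte.
my $data = ["\x{263A}"];

my $chars = JSON::PP->new->encode($data);        # character string
my $bytes = JSON::PP->new->utf8->encode($data);  # UTF-8 encoded bytes

# Compare the FLAGS line (is UTF8 listed?) and the PV field
# (the raw buffer contents) of the two dumps.
Dump($chars);
Dump($bytes);
```

The FLAGS and PV lines show how Perl holds each string internally, which is usually enough to tell which of the two forms a module returned.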

      - tye        

        Ok, ok. Although from my perspective, there's little sense in getting (2), it might make sense in situations beyond me.

        Apart from that, my use case (where the problem occurred) requires "round trip integrity", meaning: Given a string, I can...
        1. -> encode it as JSON
        2. -> write it to db
        3. -> read it back
        4. -> decode from JSON
        ... and have the same Perl structure as before, or, put differently, a structure that behaves like before (I make this small distinction to leave room for: ok, I didn't check whether the utf-8 flag was set...)
        With my "round trip", relying on JSON::to_json(), the encoded data blew up from_json() on the "decode from JSON" step.
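The four steps above can be sketched as follows. This is not the original test script. It assumes the database stores and returns raw bytes, and a plain scalar copy stands in for the write and read. JSON::PP is used so the sketch runs on a stock Perl; `$codec`, `$stored` and `$loaded` are illustrative names.

```perl
use strict;
use warnings;
use JSON::PP ();

# One codec object for both directions, so encode and decode agree
# on the string form: UTF-8 bytes, because ->utf8 is enabled.
my $codec = JSON::PP->new->utf8->canonical;

my $original = { greeting => "gr\x{fc}\x{df}e \x{263A}" };

my $stored = $codec->encode($original);  # 1. encode as JSON (bytes)
my $loaded = $stored;                    # 2./3. stand-in for db write and read
my $copy   = $codec->decode($loaded);    # 4. decode from JSON

print $copy->{greeting} eq $original->{greeting}
    ? "round trip ok\n"
    : "round trip FAILED\n";

# Mismatch: character-form JSON handed to a decoder that expects bytes.
my $chars  = JSON::PP->new->encode($original);
my $result = eval { $codec->decode($chars) };
print "mismatched decode failed: $@" unless defined $result;
```

The point of the sketch is only that the encoder and decoder must agree on form (1) or form (2), and that whatever sits between them must hand back the same form it was given.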

        (Original test script updated.) Devel::Peek output: