in reply to JSON::XS (and JSON::PP) appear to generate invalid UTF-8 for character in range 127 to 255
"So the string containing guillemets («») is valid UTF-8, but the resulting JSON is not. What am I missing? The `utf8` pragma is correctly marking my source."

JSON::XS says:

"(encode_json) Converts the given Perl data structure to a UTF-8 encoded, binary string (that is, the string contains octets only). Croaks on error."

Test::utf8 says:

"(is_sane_utf8) This test fails if the string contains something that looks like it might be dodgy utf8, i.e. containing something that looks like the multi-byte sequence for a latin-1 character but perl hasn't been instructed to treat as such... This test fails when... The string contains utf8 byte sequences and the string hasn't been flagged as utf8 (this normally means that you got it from an external source like a C library;"

Apparently it tests whether the string was properly decoded (I'm not familiar with it). I guess you need to run the encode_json output through Encode::decode_utf8 before feeding it to the second is_sane_utf8 (the Test::utf8 docs have an example using Encode::_utf8_on).
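
A minimal sketch of the difference between the octets that encode_json returns and the decoded character string, assuming JSON::PP (a core module) behaves like JSON::XS here. The variable names and the sample text are made up for illustration, and Test::utf8 is left out so the snippet runs with core modules only:

```perl
use strict;
use warnings;
use utf8;                        # string literals in this file are UTF-8
use JSON::PP qw(encode_json);    # JSON::XS exports a function of the same name
use Encode qw(decode_utf8);

# Hypothetical sample data: a character string with two guillemets.
my $text = "«quoted»";

# encode_json returns UTF-8 octets, so each guillemet takes two bytes here.
my $json_bytes = encode_json([$text]);

# Decoding turns those octets back into a character string.
my $json_chars = decode_utf8($json_bytes);

# Only numbers are printed, so no output layer is needed on STDOUT.
printf "text:    %d characters\n", length $text;
printf "encoded: %d octets\n",     length $json_bytes;
printf "decoded: %d characters\n", length $json_chars;
```

If the guess above is right, it is the decoded string, not the raw encode_json output, that should go to the second is_sane_utf8 call.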