in reply to Re: JSON, Data::Dumper and accented chars in utf-8
in thread JSON, Data::Dumper and accented chars in utf-8
I have a feeling that Dumper shouldn't produce ambiguous output, so some kind of escaping is what we should expect from it.
So unless I did something horribly wrong, here is a test script. The input is sprinkled with a Unicode character outside the Latin-1 range for the sake of example and is encoded to a UTF-8 byte string before being handed to JSON. The script prints the input, then dumps the parsed JSON, the input string and the encoded input to STDOUT on a UTF-8 terminal, with your lines added.

    use Data::Dumper;
    use Encode qw(encode);
    use JSON;
    use utf8;
    use open ":std", ":encoding(UTF-8)";
    #use open ":std", ":locale"; ## totally didn't do anything for me

    my $j = qq/{ "Particípio passadő": 1 }/;
    my $jp = JSON->new->utf8;
    my $d = $jp->decode(encode("UTF-8", $j));

    print "$j\n";
    print Dumper($d);
    print Dumper($j);
    print Dumper(encode("UTF-8", $j));

The output:

    { "Particípio passadő": 1 }
    $VAR1 = { "Partic\x{ed}pio passad\x{151}" => 1 };
    $VAR1 = "{ \"Partic\x{ed}pio passad\x{151}\": 1 }";
    $VAR1 = '{ "ParticÃpio passadÅ": 1 }';
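The last two dumps differ because one is of a character string and the other of its UTF-8 encoded bytes. A minimal sketch of that distinction, using only the key from the example (the variable names are mine, not from the thread):

```perl
use strict;
use warnings;
use utf8;                      # the source file itself is saved as UTF-8
use Encode qw(encode decode);

my $chars = "Particípio passadő";       # character string (decoded text)
my $bytes = encode("UTF-8", $chars);    # byte string (UTF-8 octets)

# length() counts characters in the first case and octets in the second;
# the two accented letters each take two bytes in UTF-8.
my $char_len = length $chars;
my $byte_len = length $bytes;

# Decoding the bytes gives the original characters back.
my $round_trip = decode("UTF-8", $bytes);

printf "chars=%d bytes=%d same=%s\n",
    $char_len, $byte_len, ($round_trip eq $chars ? "yes" : "no");
```

Dumper has no way of knowing that the second string is "really" text, so it treats every octet as a character of its own.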
According to the manual, evaling the Dumper output should give us back the original data, so the escaped wide characters in the string seem right to me. What is peculiar is that, when given a UTF-8 byte string, it doesn't escape anything and dumps something awkward instead (last line), presumably because the raw bytes are passed through as-is and then encoded a second time by the :encoding(UTF-8) layer on STDOUT. With $Data::Dumper::Useqq set, it produces a better-looking string:
$VAR1 = "{ \"Partic\303\255pio passad\305\221\": 1 }";
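To check the round-trip claim for both kinds of string, here is a rough sketch, assuming a reasonably recent Data::Dumper. Setting $Data::Dumper::Terse is my addition: it drops the "$VAR1 = " prefix so that the eval also works under strict and simply returns the value.

```perl
use strict;
use warnings;
use utf8;
use Data::Dumper;
use Encode qw(encode);

$Data::Dumper::Useqq = 1;   # double-quoted output with escapes for non-ASCII
$Data::Dumper::Terse = 1;   # no "$VAR1 = " prefix, so eval returns the value

my $chars = qq/{ "Particípio passadő": 1 }/;   # character string
my $bytes = encode("UTF-8", $chars);            # UTF-8 encoded byte string

my $dumped_chars = Dumper($chars);   # wide characters escaped as \x{...}
my $dumped_bytes = Dumper($bytes);   # high bytes escaped in octal

# Both dumps contain only ASCII, so the output layer cannot mangle them.
print $dumped_chars, $dumped_bytes;

# Evaluating each dump should reproduce the value that was dumped.
my $chars_back = eval $dumped_chars;
my $bytes_back = eval $dumped_bytes;

print "characters round-trip\n" if $chars_back eq $chars;
print "bytes round-trip\n"      if $bytes_back eq $bytes;
```

Either way the dump is unambiguous; the escapes just tell you whether you were holding characters or bytes at that point.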