
I have a feeling that Dumper shouldn’t actually produce ambiguous output, so escaped characters are what we should expect from it.

Unless I did something horribly wrong, the script below takes the input (sprinkled with a Unicode character outside the Latin-1 range for the sake of example, and converted to a UTF-8 encoded byte stream for JSON), parses it as JSON, and prints everything to STDOUT on a UTF-8 terminal, with your lines added.

use Data::Dumper;
use Encode qw(encode);
use JSON;
use utf8;
use open ":std", ":encoding(UTF-8)";
#use open ":std", ":locale";  ## totally didn't do anything for me

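# Character string; ő (U+0151) lies outside the Latin-1 range
my $j = qq/{ "Particípio passadő": 1 }/;
# ->utf8 makes decode() expect UTF-8 encoded bytes, hence the encode() below
my $jp = JSON->new->utf8;
my $d = $jp->decode(encode("UTF-8", $j));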

print "$j\n";
print Dumper($d);
print Dumper($j);
print Dumper(encode("UTF-8", $j));

The output:

{ "Particípio passadő": 1 }
$VAR1 = {
          "Partic\x{ed}pio passad\x{151}" => 1
        };
$VAR1 = "{ \"Partic\x{ed}pio passad\x{151}\": 1 }";
$VAR1 = '{ "ParticÃpio passadÅ": 1 }';
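My reading of why that last line looks garbled: the dumped byte string passes through the :encoding(UTF-8) layer on STDOUT, which treats each byte as a Latin-1 character and encodes it a second time. A minimal sketch of that effect, using a shortened string and variable names of my own ($chars, $bytes, $on_the_wire, $shown are not from the script above):

```perl
use strict;
use warnings;
use Encode qw(encode decode);

my $chars = "passad\x{151}";          # 7 characters, the last one is U+0151
my $bytes = encode("UTF-8", $chars);  # 8 bytes, U+0151 becomes C5 91

# An :encoding(UTF-8) layer treats every byte of a byte string as a Latin-1
# character and encodes it again; calling encode() on the bytes imitates that.
my $on_the_wire = encode("UTF-8", $bytes);   # 10 bytes

printf "%d characters, %d bytes, %d bytes after the second encoding\n",
    length($chars), length($bytes), length($on_the_wire);

# What a UTF-8 terminal decodes and displays for the last two positions:
# U+00C5 (shown as Å) and U+0091 (an invisible control character).
my $shown = decode("UTF-8", $on_the_wire);
printf "U+%04X ", ord($_) for split //, substr($shown, -2);
print "\n";
```

The same happens to the C3 AD pair for í, which comes out as Ã followed by an invisible soft hyphen.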

According to the manual, eval-ing the Dumper output should give us back the original data, so the escaped wide characters in the string seem right to me. What is peculiar is that, when given a UTF-8 byte stream, it does not escape anything and dumps something awkward instead (the last line). With $Data::Dumper::Useqq set, it produces a better-looking string:

$VAR1 = "{ \"Partic\303\255pio passad\305\221\": 1 }";
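To check the round-trip claim rather than take it on faith, here is a rough sketch that dumps both the character string and its UTF-8 bytes with Useqq set and evals the result back. I also set $Data::Dumper::Terse so the dump is a bare expression without the "$VAR1 = " prefix (my choice for the sketch, not something the script above does), and I write the accented characters as \x{...} escapes so the snippet does not depend on "use utf8".

```perl
use strict;
use warnings;
use Data::Dumper;
use Encode qw(encode);

$Data::Dumper::Terse = 1;   # dump a bare expression that eval can return
$Data::Dumper::Useqq = 1;   # double-quoted output with escapes

# Same JSON text as above: once as characters, once as UTF-8 bytes.
my $chars = qq/{ "Partic\x{ed}pio passad\x{151}": 1 }/;
my $bytes = encode("UTF-8", $chars);

for my $orig ($chars, $bytes) {
    my $dumped = Dumper($orig);   # escaped, so plain ASCII
    my $back   = eval $dumped;    # evaluate the dump as Perl code
    die $@ if $@;
    print $dumped;
    print $back eq $orig ? "round trip ok\n" : "round trip FAILED\n";
}
```

Both strings should come back unchanged, which is the sense in which the escaped form is unambiguous: the dump itself is ASCII-only, so no output layer or terminal encoding can mangle it.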

In reply to Re^2: JSON, Data::Dumper and accented chars in utf-8 by Ralesk
in thread JSON, Data::Dumper and accented chars in utf-8 by silentius
