in reply to malformed UTF-8 character in JSON string in perl

Your problem does not seem to be related to JSON. I get the same without using JSON:
$ perl -E '
use utf8;
use Data::Dumper;
my $data = qq( { "cat" : "text – abcd" } );
print Dumper($data);
'
$VAR1 = " { \"cat\" : \"text \x{2013} abcd\" } ";
If I do not use Data::Dumper, I obtain this:
$ perl -E '
use utf8;
my $data = qq( { "cat" : "text – abcd" } );
say $data
'
Wide character in say at -e line 4.
 { "cat" : "text – abcd" }
i.e. a warning but the right output. As Corion already mentioned, it is a feature of Data::Dumper to display such characters as \x{...} escapes; \x{2013} is the Unicode code point of the en dash, not a sequence of UTF-8 bytes.
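
To make the escape concrete, here is a minimal sketch (not taken from the original posts; the variable names are only illustrative) showing that under use utf8 the dash is a single character with code point U+2013, and that it turns into three bytes once it is encoded as UTF-8. Encode is a core module.

```perl
use strict;
use warnings;
use utf8;                       # the source code itself is UTF-8
use Encode qw(encode);

my $dash = "–";                 # EN DASH, not an ASCII hyphen

# Under "use utf8" the literal is one character, not three bytes.
my $chars     = length $dash;
my $codepoint = sprintf "U+%04X", ord $dash;

# Encoding turns the character into the bytes that go to a file or terminal.
my $bytes = encode("UTF-8", $dash);
my $hex   = join " ", map { sprintf "%02X", ord } split //, $bytes;

# Only ASCII is printed here, so no "Wide character" warning is triggered.
print "$chars character, $codepoint, UTF-8 bytes: $hex\n";
# prints: 1 character, U+2013, UTF-8 bytes: E2 80 93
```

The "Wide character" warning appears when such a character is printed to a handle that has no encoding layer to perform that last step.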

With binmode, as I already suggested several days ago in an answer to your previous post with the same content, the warning goes away:

$ perl -E '
use utf8;
my $data = qq( { "cat" : "text – abcd" } );
binmode STDOUT, ":utf8";
say $data;
'
 { "cat" : "text – abcd" }
Have you tried binmode?
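
What an output layer does can be checked without involving a terminal at all. The following is only a sketch under my own assumptions (the in-memory filehandle and the variable names are not from this thread): it prints the same string through an :encoding(UTF-8) layer, the stricter relative of :utf8, into a scalar and then looks at the bytes that were written.

```perl
use strict;
use warnings;
use utf8;

my $data = qq( { "cat" : "text – abcd" } );

# Print into a scalar through an encoding layer instead of STDOUT.
my $buffer = '';
open my $fh, '>:encoding(UTF-8)', \$buffer
    or die "Cannot open in-memory handle: $!";
print {$fh} $data;
close $fh or die "Cannot close in-memory handle: $!";

# $data holds characters; $buffer holds the encoded bytes.
my $char_count = length $data;
my $byte_count = length $buffer;

print "characters: $char_count, bytes written: $byte_count\n";
```

The byte count is two higher than the character count because the en dash is one character but three bytes in UTF-8. If this holds on your machine, Perl is producing correct output and any remaining garbage comes from how the bytes are displayed.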

Re^2: malformed UTF-8 character in JSON string in perl
by Yllar (Novice) on Aug 11, 2015 at 13:55 UTC

    I am not sure how you got the output "{ "cat" : "text – abcd" }". I just tried with binmode and got the same error again (see below).

    text ΤΗτ abcd

    Would you please try this code and confirm again?
    use strict;
    use warnings;
    use utf8;
    use JSON;
    my $data = "text – abcd";
    binmode STDOUT, ":utf8";
    print "$data";

    Please remember that the character in the statement ("text – abcd") is not a regular dash.

      I know it is not a regular dash, and I didn't have a regular dash in my 3 examples above, as can be seen in the first one with Data::Dumper (showing the escape sequence) and in the second one, which displays the warning about a wide character.

      This is the same example with a regular dash first and then an "irregular" one, using Data::Dumper first to show the \x{...} escape for the irregular dash, and then showing what I get with binmode:

      $ perl -e '
      use strict;
      use warnings;
      use utf8;
      use Data::Dumper;
      my $data = "regular dash - other type of beast: – abcd";
      print Dumper $data;
      binmode STDOUT, ":utf8";
      print "$data";
      '
      $VAR1 = "regular dash - other type of beast: \x{2013} abcd";
      regular dash - other type of beast: – abcd
      If this does not work for you and it does for me, I would suspect that either your version of Perl is too old (I am using 5.14) or that there is something wrong in your terminal configuration.

        Hi Laurent_R

        I really do not understand why it is not working for me. I am using perl 5, version 12, subversion 2.

      binmode STDOUT, ":utf8";
      But is your terminal in UTF-8 mode? Try running perl yourscript.pl > utf8.txt and opening utf8.txt as a UTF-8 text file. If you need to output Unicode characters to the terminal, try Encode::Locale and binmode STDOUT, ":encoding(console_out)" (assuming that your terminal uses an encoding that includes these Unicode characters; on Windows you may want to run chcp 65001 first).
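
A small sketch of why a non-UTF-8 terminal shows garbage where the dash should be. The thread does not say which code page the terminal in question uses, so cp437 here is purely an assumption for illustration; the point is only that three correct UTF-8 bytes turn into three unrelated characters when the console reads them as a single-byte encoding.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The UTF-8 bytes Perl writes for EN DASH (U+2013) once binmode is set.
my $utf8_bytes = encode("UTF-8", "\x{2013}");

# Hypothetical non-UTF-8 console: reinterpret the same bytes as cp437.
my $as_cp437 = decode("cp437", $utf8_bytes);

# List the code points such a console would render instead of the dash.
my $seen = join " ", map { sprintf "U+%04X", ord } split //, $as_cp437;

print "a cp437 console would show: $seen\n";
# prints: a cp437 console would show: U+0393 U+00C7 U+00F4
```

Another code page gives a different trio of characters, but the pattern is the same: one dash becomes three symbols. Redirecting to a file and opening it as UTF-8, as suggested above, tells you whether the Perl side is already correct.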