dweston has asked for the wisdom of the Perl Monks concerning the following question:

Final edit: After reading the replies I've come to the conclusion that this was the wrong place to go to. I've had more replies about the politics and manners of the monks than an actual discussion about code. If someone responds to this with an explanation or a solution then I hope that this thread will come up in a Google search for any beginners that have the same problem I do and serve as a warning.

Hello, I've been trying to pass a Chinese character to a JSON hash but it always comes out as "女"

    #!/usr/bin/perl
    use JSON;

    #variable declaration
    my $gender = "Female"

    #turning english selection to Chinese character
    if ($gender eq 'Female') {
        $gender = "女";
    }
    elsif ($gender eq 'Male') {
        $gender = "男";
    }
    elsif ($gender eq 'Decline to state') {
        $gender = "";
    }

    my $hash_ref = {};
    $hash_ref->{'detail_sex'} = $gender;
    print JSON->new->utf8(1)->pretty(1)->encode($hash_ref);

This is the result I get:

    {
       "detail_sex" : "女"
    }

However, when I test another script it comes out perfectly.

    #!/usr/bin/perl
    use Digest::MD5 qw(md5 md5_hex md5_base64);
    use Encode qw(encode_utf8);
    use JSON;

    my $userid = 1616589;
    my $time = 2015811;
    my $ejob_id = 1908063;

    # md5 encryption without chinese characters
    my $md5_hex_sign = md5_hex($userid,$time,$job_id);
    print "$md5_hex_sign\n";
    # seeing if character will print
    print "let's try encoding and decrypting \n";
    print "the character to encrypt.\n";
    print "女\n";
    print "unicode print out\n";
    print "\x{5973}\n";
    my $char = "\x{5973}";
    my $sign_char = "女";
    print "unicode stored in \$char variable \n";
    print $char, "\n";
    print "md5 encryption of said chinese character from \$char with utf8 encoding\n";
    print md5_hex(encode_utf8($char)), "\n";
    print "md5 encryption of wide character with utf8 encoding\n";
    print md5_hex(encode_utf8("女")), "\n";
    my $sign_gender = md5_hex(encode_utf8($sign_char));

    #JSON
    print "JSON print out\n";
    my $hash_ref = {};
    $hash_ref->{'gender'} = $char;
    $hash_ref->{'md5_gender'} = md5_hex(encode_utf8($char));
    $hash_ref->{'char_gender'} = md5_hex(encode_utf8("女"));
    $hash_ref->{'sign_gender'} = $sign_gender;
    print JSON->new->utf8(1)->pretty(1)->encode($hash_ref);

Here is the result:

    160a6f4bf9aec1c2d102330716ca8f4e
    let's try encoding and decrypting
    the character to encrypt.
    女
    unicode print out
    Wide character in print at md5check.pl line 18.
    女
    unicode stored in $char variable
    Wide character in print at md5check.pl line 22.
    女
    md5 encryption of said chinese character from $char with utf8 encoding
    87c835a6b1749374a7524a596087b296
    md5 encryption of wide character with utf8 encoding
    06c82a10da7e297180d696ed92f524c1
    JSON print out
    {
       "char_gender" : "06c82a10da7e297180d696ed92f524c1",
       "md5_gender" : "87c835a6b1749374a7524a596087b296",
       "sign_gender" : "06c82a10da7e297180d696ed92f524c1",
       "gender" : "女"
    }

Would someone kindly explain to me what is going on?

Replies are listed 'Best First'.
Re: Trouble with Chinese characters
by choroba (Cardinal) on Sep 03, 2015 at 22:55 UTC
    Crossposted at StackOverflow. Note that it's considered polite to inform about crossposting as there are people not attending both the sites, and some of them might waste time hacking a solution for a problem already solved at the other end of the internet.

    Also, there's a semicolon missing on line 5.

    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Trouble with Chinese characters ( binmode )
by 1nickt (Canon) on Sep 03, 2015 at 21:34 UTC

    See this node from all of two days ago.

    The way forward always starts with a minimal test.
      This is a brand new user to PM. I'd advise diplomacy, and an explanation as to how your piece fits within the question.

        Well, I don't think it's undiplomatic to expect a brand new user to take a couple of minutes to read before posting his question. I would certainly read the last few discussions on any board before I posted (not to mention use the search function). Lamentably, most people who come to forums like this one want to get instant gratification in the form of someone handing them a solution as soon as they ask their question, which is right after they create a user account (or not), which is right after they visit the site for the first time.

        It's a courtesy to every such site in the world to search before asking, and there are plenty of notices that this is expected at PerlMonks. Most new users don't do that, and it generally goes unadmonished. But when the answer to the question and plenty of links to documentation are contained in a post that is still on the front page of most views, I think it's fair to point out that fact, and even the fact that the user could have found it with extremely minimal effort, which I did by adding the words "all of" to my post.

        I also linked to the node so he could read the whole discussion, and I edited the subject to reference the use of binmode, which, as you pointed out in that thread, is what he needs. I really think overall that was low on the undiplomacy scale, but then I am not the best judge of that. ++your post because you might be.

        The way forward always starts with a minimal test.
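
For readers who land here from a search, the binmode advice above can be sketched roughly as follows. This is a minimal illustration rather than the original poster's final code. It assumes the script file is saved as UTF-8 and the terminal expects UTF-8. It uses JSON::PP (shipped with Perl) in place of JSON so that it runs without CPAN; the methods called are ones both modules provide. The helper name gender_label is made up for the example. The idea is that "use utf8" turns the literal in the source into a single character instead of three bytes, and that the encoding to UTF-8 should then happen exactly once, either in the JSON encoder or in the output handle, but not in both.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;        # string literals in this file are UTF-8 text; decode them to characters
use JSON::PP;    # core module, used here as a stand-in for JSON (assumption)

# Hypothetical helper: map the English selection to its Chinese label.
sub gender_label {
    my ($gender) = @_;
    my %label = (
        'Female'           => '女',
        'Male'             => '男',
        'Decline to state' => '',
    );
    return $label{$gender} // '';
}

my $data = { detail_sex => gender_label('Female') };

# Option A: the encoder returns UTF-8 bytes; print them to a handle
# that has no encoding layer, so they are written out unchanged.
my $bytes = JSON::PP->new->utf8(1)->encode($data);
print $bytes, "\n";

# Option B: the encoder returns a character string; the handle's
# :encoding(UTF-8) layer converts it to bytes on output.
my $chars = JSON::PP->new->utf8(0)->encode($data);
binmode(STDOUT, ':encoding(UTF-8)');
print $chars, "\n";
```

Mixing the two (a utf8(1) encoder writing to a handle with an encoding layer, or a byte-string literal without "use utf8" passed to a utf8(1) encoder) encodes the text twice, which is one likely source of the garbled output in the question. Option B also avoids the "Wide character in print" warning seen in the second script, because the handle now knows how to encode characters above 255.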