
It's not so much the warnings I'm worried about. I'm fully aware that I can get rid of the warnings. What I'm confused about (and I admit, I should have been clearer on this) is why one script will print out the Chinese character I want and the other prints out 女. I am more concerned with the "why" than with a solution. I feel that if I know why something is happening I can find a solution on my own (not to say I wouldn't appreciate one).

Re^5: Trouble with Chinese characters ( binmode )
by 1nickt (Canon) on Sep 04, 2015 at 04:12 UTC

    Hi dweston. Sorry we didn't make you feel more welcome. Hopefully you've figured it out now from the earlier links I posted, and from your replies on StackOverflow. If not, read on.

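    The reason the first program fails to print out the Chinese character is that you never tell Perl that it has non-ASCII characters in the source. In the second program you don't say so either, but you encode the characters so Perl outputs them properly. Without use utf8; Perl reads each octet of your UTF-8 source file as a separate one-byte character, so '女' becomes a three-character string rather than a single character. When that string is then encoded as UTF-8 on output, each of those three characters is encoded again, and you get garbage instead of the character you wanted.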
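
    If it helps to see that in numbers, here is a minimal sketch using only the core Encode module. It is not taken from either of your scripts: the variable names are made up for illustration, and the three octets of '女' (U+5973) are written as escapes so the result does not depend on how the file is saved.

    ```perl
    use strict;
    use warnings;
    use Encode qw(decode encode);

    # The three UTF-8 octets of U+5973. Without "use utf8", a literal
    # '女' in a UTF-8 source file reaches Perl as exactly this string.
    my $octets = "\xE5\xA5\xB3";

    # What "use utf8" effectively gives you: one decoded character.
    my $char = decode('UTF-8', $octets);

    # Encoding the undecoded string treats each octet as a character
    # of its own, so the result is double-encoded.
    my $double = encode('UTF-8', $octets);

    printf "undecoded length:      %d\n", length $octets;
    printf "decoded length:        %d\n", length $char;
    printf "double-encoded length: %d\n", length $double;
    ```

    The undecoded string has three characters, the decoded one has one, and the double-encoded string is six octets long, which is the garbage you see on screen.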

    To make Perl correctly output the Chinese characters in your first program, tell it that the source file is UTF-8 encoded by putting use utf8; near the top.

    
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;
    use JSON;
    
    my %genders_zh = (
      'Female'           => '女',
      'Male'             => '男',
      'Decline to State' => '',
    );
    
    my $gender   = 'Female';
    my $hash_ref = {};
    
    $hash_ref->{'detail_sex'} = $genders_zh{ $gender }; 
    
    print JSON->new->utf8( 1 )->pretty( 1 )->encode( $hash_ref );
    __END__
    
    
    Output:
    
    $ perl 1140925.pl
    {
       "detail_sex" : "女"
    }
    $
    

    Note that you are only avoiding the "Wide character in print" warning because you tell JSON to encode your output as utf8. If your statement didn't include the ->utf8( 1 ), you would have to deal with it another way, e.g. by calling binmode on your STDOUT:

    
    #!/usr/bin/perl
    use strict;
    use warnings;
    use utf8;
    binmode STDOUT, ':encoding(UTF-8)';  # encode characters as UTF-8 on output
    
    my %genders_zh = (
      'Female'           => '女',
      'Male'             => '男',
      'Decline to State' => '',
    );
    
    my $gender   = 'Female';
    
    print $genders_zh{ $gender }, "\n"; 
    
    __END__
    
    Output:
    
    $ perl 1140925.pl
    女
    $
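
    To see what ->utf8( 1 ) actually changes, here is a rough sketch. It assumes the core JSON::PP module as a stand-in for JSON (the utf8 and encode methods used here behave the same way in both), and the variable names are only for illustration.

    ```perl
    use strict;
    use warnings;
    use JSON::PP;

    # U+5973 written as an escape so this file needs no "use utf8".
    my $data = { detail_sex => "\x{5973}" };

    # utf8(1): encode() returns UTF-8 octets, ready for a raw filehandle.
    my $octets = JSON::PP->new->utf8(1)->encode($data);

    # utf8(0): encode() returns a character string; the output
    # filehandle has to do the encoding instead.
    my $chars = JSON::PP->new->utf8(0)->encode($data);

    printf "octets: %d, characters: %d\n", length $octets, length $chars;

    # Let STDOUT encode the character string as UTF-8.
    binmode STDOUT, ':encoding(UTF-8)';
    print $chars, "\n";
    ```

    The octet version is two longer because '女' takes three octets in UTF-8 but is one character. Pick one place to do the encoding: printing the octet version through an encoding layer would double-encode it.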
    

    Hope this helps.

    The way forward always starts with a minimal test.