in reply to malformed UTF-8 character in JSON string in perl

What's wrong with Text::Unidecode?

Also, why/how is that UTF-8 character malformed? Are you sure that your script is in the correct encoding for your stanza to work? I prefer to be explicit when using characters above 127 in my scripts and use \N{PILE OF POO} instead of inserting the character verbatim into the source code in the hope that it won't get mangled.
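To illustrate the point about being explicit, here is a minimal sketch. It assumes the string in question contains an en dash, which is a guess based on the thread; the variable names are made up for the example. The character is written as a named escape, so the encoding the script file is saved in no longer matters.

```perl
use strict;
use warnings;
use charnames ':full';     # needed for \N{...} on Perl older than 5.16
use Encode qw(encode);

# Named escape instead of a literal non-ASCII character in the source
my $text = "text \N{EN DASH} abcd";

# Encode explicitly to see what actually goes out on the wire
my $bytes = encode('UTF-8', $text);

# Tell Perl which encoding the output handle should use
binmode(STDOUT, ':encoding(UTF-8)');

print "characters: ", length($text),  "\n";   # the en dash counts as 1 character
print "bytes:      ", length($bytes), "\n";   # the en dash takes 3 bytes in UTF-8
print $text, "\n";
```

If the terminal still shows garbage after this, the terminal's own code page is the likely culprit rather than the script.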

Upon rereading your post, if you use Data::Dumper, the escaping of the output string is a feature. Maybe you can be more explicit in what output you get and what output you really want, and whether the output should be ASCII or UTF-8, and whether the input should be ASCII or UTF-8.
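A small sketch of what I mean by the escaping being a feature, again assuming an en dash in the data (the variable names are only for illustration). Data::Dumper shows the code point in escaped form so the dump is unambiguous; printing the string itself through an encoding layer gives the real character.

```perl
use strict;
use warnings;
use charnames ':full';
use Data::Dumper;

my $text = "text \N{EN DASH} abcd";

# Terse drops the "$VAR1 = " prefix so only the quoted string is shown
local $Data::Dumper::Terse = 1;
my $dumped = Dumper($text);

# The dump contains the escaped code point, not the raw character
print $dumped;

# Printing the string itself, with an explicit output encoding
binmode(STDOUT, ':encoding(UTF-8)');
print $text, "\n";
```

So the dump is for debugging, and the plain print is what you would send to a file or terminal.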

Re^2: malformed UTF-8 character in JSON string in perl
by Yllar (Novice) on Aug 11, 2015 at 15:04 UTC

    There is nothing wrong with Text::Unidecode, but I don't want to use this module. I would like to convert Unicode characters such as single quotes, hyphens, double quotes, and bullets in my input data to their ASCII equivalents.

    Can you please print the two strings below and look at the output?

    1) "text - abcd" 2) "text – abcd"

    When you print them, the first one comes out unchanged. For the second one you will get output like "text ΤΗτ abcd", which I do not want.
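For what is being asked here (a handful of punctuation characters mapped to ASCII without Text::Unidecode), a hand-rolled lookup table is one possible approach. This is only a sketch: `%ascii_for` and `to_ascii_punct` are hypothetical names, the table covers just the characters mentioned in the post, and it assumes the input has already been decoded from UTF-8 bytes into Perl characters.

```perl
use strict;
use warnings;
use charnames ':full';
use Encode qw(decode);

# Hypothetical mapping table; extend it for whatever your data contains
my %ascii_for = (
    "\N{LEFT SINGLE QUOTATION MARK}"  => "'",
    "\N{RIGHT SINGLE QUOTATION MARK}" => "'",
    "\N{LEFT DOUBLE QUOTATION MARK}"  => '"',
    "\N{RIGHT DOUBLE QUOTATION MARK}" => '"',
    "\N{HYPHEN}"                      => '-',
    "\N{EN DASH}"                     => '-',
    "\N{EM DASH}"                     => '-',
    "\N{BULLET}"                      => '*',
);

# Replace every character found in the table; leave everything else alone
sub to_ascii_punct {
    my ($str) = @_;
    my $re = join '|', map { quotemeta } keys %ascii_for;
    $str =~ s/($re)/$ascii_for{$1}/g;
    return $str;
}

# Raw UTF-8 bytes as they might arrive from a file or socket
my $raw   = "text \xE2\x80\x93 abcd";
my $chars = decode('UTF-8', $raw);    # bytes -> characters

print to_ascii_punct($chars), "\n";
```

Skipping the decode step means the substitution sees three separate bytes instead of one en dash and nothing matches, which may be related to the garbled output described above.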

      There is nothing wrong with Text::Unidecode, but I don't want to use this module.
      Then you are SOL, aren't you? This is the easiest way to accomplish what you're after... So what? Is this homework?