codewalker has asked for the wisdom of the Perl Monks concerning the following question:

How do I convert Unicode entities for Japanese into the corresponding Japanese characters?

I would also like to know the list of Unicode entities for Japanese characters, along with the list of named entities.

Replies are listed 'Best First'.
Re: To convert the unicode entity (症) to japanese character 在
by SuicideJunkie (Vicar) on Mar 02, 2015 at 14:26 UTC
    hex('75c7') = 30151, but ord('在') = 22312, so the entity and the character in the title do not match.

    Do you perhaps want to decode HTML Entities?

    Or maybe tr/// the first character into the second?
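
    The mismatch can be checked directly. A minimal sketch, assuming the script is saved as UTF-8 (hence `use utf8`); it uses only the Perl built-ins `hex` and `ord`, and the variable names are made up for illustration:

```perl
use strict;
use warnings;
use utf8;                          # the literal 在 below is UTF-8 source text

my $from_entity = hex('75c7');     # code point named by the hex entity
my $wanted      = ord('在');       # code point of the character in the title

printf "entity: %d (U+%04X)\n", $from_entity, $from_entity;
printf "wanted: %d (U+%04X)\n", $wanted, $wanted;
print  "same character? ", ($from_entity == $wanted ? "yes" : "no"), "\n";
```

    The two code points differ (U+75C7 versus U+5728), so decoding the entity alone cannot produce 在.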

Re: To convert the unicode entity (症) to japanese character 在
by graff (Chancellor) on Mar 03, 2015 at 18:20 UTC
    How to convert the Unicode entities for {anything} into {any} character:
    s/\&#(\d+);/chr($1)/ge;                  # decimal char. entity
    s/\&#x([\da-f]+);/chr(hex($1))/ige;      # hex char. entity
    Also know the Unicode entity list for Japanese characters:

    http://www.unicode.org/charts/
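
    As a sketch of how the two substitutions above might be used together, the following wraps them in a sub and encodes the result as UTF-8 before printing. The sub name `decode_numeric_entities` is hypothetical, and the sketch assumes the output should be UTF-8. It handles numeric entities only; named entities such as `&amp;` would need a lookup table or a module such as HTML::Entities.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: replaces decimal and hex numeric character
# references with the characters they name.
sub decode_numeric_entities {
    my ($text) = @_;
    $text =~ s/&#(\d+);/chr($1)/ge;                # decimal, e.g. &#30151;
    $text =~ s/&#x([\da-f]+);/chr(hex($1))/ige;    # hex, e.g. &#x75c7;
    return $text;
}

my $decoded = decode_numeric_entities('&#x75c7; and &#30151;');

# Encode explicitly so print does not emit a "Wide character" warning.
print encode('UTF-8', $decoded), "\n";
```

    Both entities in the sample string name the same code point (30151 is 0x75C7), so the decoded string contains the same character twice.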

Re: To convert the unicode entity (症) to japanese character 在
by ikegami (Patriarch) on Mar 03, 2015 at 18:02 UTC
    The HTML parser you are using to extract the text from the HTML should already be doing that for you.