in reply to Decode XML &#xxxx; entities

You seem to confuse HTML entities with URI escaping. You should use HTML::Entities, not URI::Escape. Also, an HTML entity never looks like &#00FC;. FC is hexadecimal, so there must be an "x" after the "#" (&#xFC;), or else write the number in decimal (&#252;). With HTML::Entities I get the expected result:
use HTML::Entities qw(decode_entities);
use Devel::Peek;

Dump decode_entities "&#xFC;";
Dump decode_entities "&#x20AC;";
__END__
SV = PV(0x5060c8) at 0x5051e8
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK)
  PV = 0x510920 "\374"\0
  CUR = 1
  LEN = 16
SV = PV(0x5060c8) at 0x5051e8
  REFCNT = 1
  FLAGS = (TEMP,POK,pPOK,UTF8)
  PV = 0x510920 "\342\202\254"\0 [UTF8 "\x{20ac}"]
  CUR = 3
  LEN = 16
Note that the first result does not have the UTF8 flag on. To Perl this does not matter: a string whose codepoints are all < 256 behaves the same whether it is stored internally as Latin-1 or as UTF-8.
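
To illustrate that last point, here is a minimal sketch using only the core utf8:: functions (no HTML::Entities needed). The variable names are made up for the example. It stores the same character, U+00FC, once as a single byte and once in Perl's internal UTF-8 form, and checks that Perl treats the two as the same string:

```perl
use strict;
use warnings;

# One character, U+00FC, stored internally as a single byte (no UTF8 flag).
my $latin1 = "\xFC";

# Same character, but force the internal representation to UTF-8.
# utf8::upgrade changes only the storage, not the string's value.
my $upgraded = $latin1;
utf8::upgrade($upgraded);

printf "UTF8 flag: %d vs %d\n",
    (utf8::is_utf8($latin1)   ? 1 : 0),
    (utf8::is_utf8($upgraded) ? 1 : 0);
printf "equal:     %s\n", ($latin1 eq $upgraded ? "yes" : "no");
printf "length:    %d vs %d\n", length($latin1), length($upgraded);
```

The flags differ, yet the strings compare equal and both have length 1. The difference only becomes visible when you look at the internals (as Devel::Peek does) or when you output the string without an explicit encoding layer.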

Re^2: Decode XML &#xxxx; entities
by saintmike (Vicar) on Dec 04, 2007 at 21:07 UTC
    You seem to confuse HTML entities with URI escaping.

    Nope, I was just giving an example of a similarly trivial transformation that's covered by a CPAN module.

    Also, an HTML entity never looks like &#00FC;. It's hexadecimal, so there must be an "x" in between, or keep it decimal.

    Thanks, corrected in my original post.

    With HTML::Entities I get the expected result:

    Looks pretty good!