in reply to XML::Parser - Keep Encoding?

Actually, if you are using a modern (5.8) perl, the internal format is utf8 (or close enough, IIRC it is actually a superset of utf8).

On output though, unless you specify that you want UTF-8, perl will convert the data back to ISO-8859-1 if every character fits (code points up to 0xFF). If one doesn't, perl emits a "Wide character" warning and writes the UTF-8 bytes instead. To tell perl to output UTF-8, use the binmode function, as ikegami mentioned.
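
A minimal sketch of the difference, assuming a perl built with PerlIO (the default since 5.8) and no PERL_UNICODE or "use open" settings in effect. The helper name bytes_written and the in-memory filehandle are only there for illustration; with a real handle you would call binmode on STDOUT or on your file handle in the same way.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory filehandle, optionally
# through an I/O layer, and return the raw bytes that ended up in the buffer.
sub bytes_written {
    my ($text, $layer) = @_;
    my $buffer = '';
    open my $fh, '>', \$buffer or die "Cannot open in-memory handle: $!";
    binmode $fh, $layer if defined $layer;
    print {$fh} $text;
    close $fh or die "Cannot close in-memory handle: $!";
    return $buffer;
}

my $text = "caf\x{e9}";    # "cafe" with e-acute; U+00E9 fits in ISO-8859-1

my $default = bytes_written($text);                        # no layer given
my $utf8    = bytes_written($text, ':encoding(UTF-8)');    # explicit UTF-8

printf "default: %s\n", join ' ', map { sprintf '%02X', ord } split //, $default;
printf "UTF-8:   %s\n", join ' ', map { sprintf '%02X', ord } split //, $utf8;
```

For ordinary output the equivalent one-liner is binmode STDOUT, ':encoding(UTF-8)'; before the first print.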

Re^2: XML::Parser - Keep Encoding?
by ikegami (Patriarch) on Apr 21, 2009 at 15:50 UTC

    Actually, if you are using a modern (5.8) perl, the internal format is utf8 (or close enough, IIRC it is actually a superset of utf8).

    Almost. The standard encoding is named UTF-8 (case-insensitive, with a dash).

    The internal format is locally known (non-standard, Perl-only) as utf8. It's a superset of UTF-8 capable of representing all 32-bit or 64-bit numbers (depending on the system, for some definition of system).
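
    The Encode module exposes both names, so the distinction can be poked at directly. This is a rough sketch, assuming a reasonably recent Encode; the code point 0x110000 is just an arbitrary value above the Unicode maximum picked for the demonstration, and the variable names are made up.

```perl
use strict;
use warnings;
use Encode qw(encode find_encoding FB_CROAK);

# The two spellings resolve to two different encoding objects.
my $strict_name = find_encoding('UTF-8')->name;   # the standard encoding
my $lax_name    = find_encoding('utf8')->name;    # Perl's lax superset

# A code point one past the Unicode maximum (U+10FFFF). Assumption for the
# demo: any value above 0x10FFFF would serve equally well.
my $beyond = chr(0x110000);

# Lax utf8 encodes it without complaint.
my $copy      = $beyond;   # encode() with a CHECK value may modify its source
my $lax_bytes = encode('utf8', $copy, FB_CROAK);

# Strict UTF-8 refuses; FB_CROAK turns the refusal into an exception.
$copy = $beyond;
my $strict_ok = eval { encode('UTF-8', $copy, FB_CROAK); 1 } ? 1 : 0;

print "UTF-8 resolves to: $strict_name\n";
print "utf8  resolves to: $lax_name\n";
printf "lax encoding produced %d bytes\n", length $lax_bytes;
print "strict encoding ", ($strict_ok ? "succeeded" : "croaked"), "\n";
```

    So for data that leaves the program, 'UTF-8' (or the :encoding(UTF-8) layer) is usually the safer spelling, since it rejects values that aren't valid in the standard encoding.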