I think it expects to be handed decoded XML.
I've personally never used it with anything but ISO-Latin-1, and haven't encountered any problems so far in this regard. But I think it's true that it doesn't properly handle Unicode, at least not multibyte encodings like UTF-16.
OTOH, I just converted an ISO-Latin-1 XML file to UTF-8 (and changed the "encoding=..." declaration, of course, though that simply appears to be ignored), and it seems to "work", at least in the sense that when I Data::Dumper the created object, the appropriate chars are passed through unmodified, i.e. still encoded. That's probably because it doesn't do any decoding at all and simply treats everything as bytes... (part of the less-features-for-speed concept, I guess)
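To illustrate what "treats everything as bytes" would mean in practice, here's a minimal sketch using only core modules. Note that extract_text_as_bytes is a hypothetical stand-in I made up for a parser that does no decoding; it is not the module's actual code, and the sample strings are just for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "cafe" with an e-acute (U+00E9), written out as UTF-8 bytes:
# the accented character takes two bytes (0xC3 0xA9).
my $utf8 = "caf\xC3\xA9";

# Hypothetical stand-in for a parser that does no decoding at all:
# it returns whatever bytes sit between the tags, or undef if the
# tags are not found.
sub extract_text_as_bytes {
    my ($xml) = @_;
    my ($text) = $xml =~ m{<a>(.*?)</a>}s;
    return $text;
}

my $raw = extract_text_as_bytes("<a>$utf8</a>");

# The bytes come back untouched, so a dump looks fine on a UTF-8 terminal...
my $bytes_unchanged = ($raw eq $utf8) ? 1 : 0;

# ...but Perl still sees 5 bytes, not 4 characters, until you decode.
my $byte_length = length $raw;
my $char_length = length decode('UTF-8', $raw);

# With UTF-16 every ASCII character is two bytes wide, so a byte-oriented
# match for "<a>" doesn't even find the tags.
my $utf16 = encode('UTF-16LE', "<a>caf\x{E9}</a>");
my $found_in_utf16 = defined(extract_text_as_bytes($utf16)) ? 1 : 0;

print "bytes unchanged:      $bytes_unchanged\n";
print "length as bytes:      $byte_length\n";
print "length as characters: $char_length\n";
print "tags found in UTF-16: $found_in_utf16\n";
```

If the parser really works this way, UTF-8 input would survive because its ASCII-range bytes (the markup) are unchanged, and you'd have to decode the returned values yourself before doing anything character-based with them.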
In reply to Re^3: Benchmarks of XML Parsers by almut
in thread Benchmarks of XML Parsers by ikegami