http://qs1969.pair.com?node_id=85348


in reply to What are you expecting XML to be in?
in thread Converting character encodings

Actually XML uses UTF-8 or UTF-16 by default (and has ways to figure out which one is used), but allows any encoding, as long as it is specified in the XML declaration (as <?xml version="1.0" encoding="whatever"?>). The parser then has to deal with the encoding.

It is an implementation choice in expat (and then in XML::Parser) that all strings are passed to the handlers in UTF-8, but I don't think the XML spec mandates this choice.

And because the environment in which the XML is used often does not support UTF-8, but rather latin 1 or shift-JIS or whatever it is often very important (and painful!) to convert all strings back to their original encoding.