
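The default encoding for XML is UTF-8: that's what a parser assumes when the document has no encoding declaration and no byte order mark saying otherwise.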

If you look at my code, you'll see that it tries to determine the charset of the HTML, and that when it exports the "original" code as XML, it sets the "encoding" attribute of the <original> element to that charset.
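For anyone following along without the code, here is a rough sketch of one way to guess the charset, by looking for a charset declaration in a <meta> tag. It's in Python purely for illustration; the name sniff_charset, the 4096-byte window and the UTF-8 fallback are my assumptions here, not what the original code does.

```python
import re

# Matches both <meta charset="..."> and the older
# <meta http-equiv="Content-Type" content="text/html; charset=..."> form.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?\s*([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)

def sniff_charset(html_bytes: bytes, default: str = "utf-8") -> str:
    """Guess the charset of raw HTML bytes from a <meta> declaration."""
    # Only scan the start of the document (assumed window size).
    head = html_bytes[:4096]
    match = META_CHARSET.search(head)
    if match is None:
        # No declaration found: fall back to the assumed default.
        return default
    return match.group(1).decode("ascii").lower()
```

This ignores HTTP headers and byte order marks, so treat the result as a hint rather than a guarantee.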

I've ended up converting that chunk of HTML to UTF-8 as well and exporting it that way. Every attempt to embed arbitrary data in a non-UTF-8 charset has failed, since an XML document has a single encoding and the parser rejects or misreads bytes that don't fit it.
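Roughly, the conversion step looks like the sketch below. Again this is Python for illustration only, not the original code: export_original and the <page> wrapper element are made-up names, and only the <original> element with its "encoding" attribute comes from the description above.

```python
import xml.etree.ElementTree as ET

def export_original(html_bytes: bytes, charset: str) -> bytes:
    """Wrap raw HTML in an <original> element, serialized as UTF-8 XML."""
    # Decode with the detected charset; undecodable bytes become U+FFFD
    # instead of aborting the export (an assumed policy).
    text = html_bytes.decode(charset, errors="replace")

    root = ET.Element("page")  # hypothetical wrapper element
    # Record the charset the HTML originally arrived in.
    original = ET.SubElement(root, "original", {"encoding": charset})
    # ElementTree escapes <, > and & when serializing the text.
    original.text = text

    # Serialize the whole document as UTF-8 bytes.
    return ET.tostring(root, encoding="utf-8")
```

The "encoding" attribute then only documents where the HTML came from; the bytes in the XML file itself are UTF-8 throughout. Note that this doesn't filter out control characters that XML 1.0 forbids, so really arbitrary data may still need cleaning first.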

--telcontar