in reply to Re^2: Converting HTML special entities to XML
in thread Converting HTML special entities to XML
They should always be expanded to UTF-8 and escaped on output. Your HTML parser should just give you Unicode, and whatever XML generator you use should be escaping it automatically for you as appropriate for the target encoding.
Don't attempt to transcode entities and whatnot manually to insert literal bytes into the output XML stream. That way lies madness (and a lot of buggy code; most code dealing with XML out there is quite broken with regard to encodings).
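To make that concrete, here is a minimal sketch of the decode-then-let-the-serializer-escape approach. It is written in Python purely for illustration and uses only the standard library (`html.unescape` and `xml.etree.ElementTree`). The helper name `html_text_to_xml` and the `<p>` wrapper element are made up for this example; substitute whatever parser and XML generator you actually use.

```python
import html
import xml.etree.ElementTree as ET


def html_text_to_xml(fragment: str, encoding: str) -> bytes:
    """Hypothetical helper: decode HTML entities, then let the XML
    serializer decide how to escape for the target encoding."""
    # Step 1: expand every HTML entity into plain Unicode characters.
    text = html.unescape(fragment)

    # Step 2: hand the Unicode string to the XML library untouched.
    elem = ET.Element("p")  # wrapper element chosen only for the demo
    elem.text = text

    # Step 3: serialize. Markup characters get escaped, and characters
    # the target encoding cannot represent become numeric references.
    return ET.tostring(elem, encoding=encoding)


source = "caf&eacute; &amp; cr&egrave;me &lt;br&gt;"
ascii_xml = html_text_to_xml(source, "us-ascii")
utf8_xml = html_text_to_xml(source, "utf-8")

print(ascii_xml)
print(utf8_xml)
```

The same Unicode string goes in both times; only the serializer's target encoding differs. Nothing in the calling code ever touches bytes or builds a character reference by hand, which is the whole point.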
Makeshifts last the longest.
Re^4: Converting HTML special entities to XML
by iburrell (Chaplain) on Sep 02, 2004 at 16:38 UTC
by Aristotle (Chancellor) on Sep 02, 2004 at 17:24 UTC