
The character versus data distinction is important. XML does have a way to express non-ASCII characters through the DTD, as noted. For true binary data, CDATA sections almost do it, but they're not foolproof: the binary data could contain the sequence `]]>`, which would make the section look like it ended before it really did. But you could encode the data using an agreed-upon scheme, such as uuencode or base64, and put the result in a CDATA section. Ugly, but possible.
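
To make the "encode it first" idea concrete, here is a minimal Python sketch, not a definitive implementation. The element name `blob`, the `encoding` attribute, and the helper names `wrap_binary` and `unwrap_binary` are made up for illustration; only the `base64` and `xml.etree.ElementTree` standard library modules are real. Since the base64 alphabet is limited to letters, digits, `+`, `/` and `=`, the encoded text can never contain `]]>`.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(payload: bytes) -> str:
    """Return a small XML document carrying payload as base64 inside CDATA."""
    encoded = base64.b64encode(payload).decode("ascii")
    # "blob" and its "encoding" attribute are hypothetical names for this sketch.
    return '<blob encoding="base64"><![CDATA[' + encoded + "]]></blob>"


def unwrap_binary(xml_text: str) -> bytes:
    """Parse the document and decode the base64 text back to the original bytes."""
    elem = ET.fromstring(xml_text)
    return base64.b64decode(elem.text or "")


# Bytes that would be unsafe raw: a NUL, a control character, "]]>", and 0xFF.
raw = b"\x00\x01]]>\xff"
doc = wrap_binary(raw)
assert unwrap_binary(doc) == raw
```

Both sides still have to agree out of band that the element holds base64, which is the "agreed-upon scheme" part. The CDATA wrapper is strictly optional here, since base64 output contains no `<` or `&` either.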

Re^4: Encoding is a pain.
by grantm (Parson) on Sep 25, 2004 at 20:08 UTC
    Actually, in XML 1.0, CDATA sections are no good for binary data even without the delimiter issue. A CDATA section is defined to contain Chars, which in turn are defined as:

    Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]

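    So, for example, control characters in the range 0x00-0x08 are not allowed. There are also encoding issues that would prevent you from putting binary bytes in CDATA: the bytes would have to form valid sequences in the document's declared encoding, which arbitrary binary data generally will not.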
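
The Char production above translates directly into a range check. The following is a rough Python sketch; the names `is_xml10_char` and `first_invalid` are invented for this example. Treating each byte value as the code point of the same number is an assumption made only to show which single-byte values fall outside Char.

```python
def is_xml10_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def first_invalid(text: str):
    """Return the index of the first character that is not a Char, or None."""
    for i, ch in enumerate(text):
        if not is_xml10_char(ord(ch)):
            return i
    return None


# Assumption for illustration: map each byte value 0..255 to the same code point.
disallowed = [b for b in range(256) if not is_xml10_char(b)]

assert not is_xml10_char(0x00)            # NUL is excluded
assert not is_xml10_char(0x08)            # as is the rest of 0x00-0x08
assert is_xml10_char(0x09)                # tab is allowed
assert first_invalid("ok\x01data") == 2   # the control character at index 2
```

Under that byte-to-code-point assumption, the excluded values are 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F, so raw binary fails the check well before the encoding question even comes up.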