in reply to Re: Re: 8-bit Clean XML Data I/O?
in thread 8-bit Clean XML Data I/O?

There's no magic encoding here: you either use an existing encoding scheme or write your own. You might just use base64.

Re: Re: Re: Re: 8-bit Clean XML Data I/O?
by samtregar (Abbot) on Feb 21, 2004 at 00:38 UTC
    Huh. It's a possibility. One problem I have with it is how much harder it would be to debug. Right now I can open up an XML file and find problems by simple inspection. With all my data in base64 I'd have to process the XML before I could read it. Which is pretty hard if the XML parser won't parse it, for example!

    I wonder if I could make a subclass of XML::Writer that Base64-encodes strings containing bytes that aren't valid UTF-8 and prefixes them with some kind of marker, so I'd know to reverse the encoding when reading. Of course, then I'd need a subclass of XML::Simple to get it back out again. Good lord, what a hack.
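
    A rough sketch of that marker idea, in Python only to keep it self-contained (in Perl, MIME::Base64 would do the encoding). The "b64:" marker and both function names are made up for illustration and are not part of XML::Writer or XML::Simple. It assumes values arrive as raw bytes, and it also Base64-encodes anything containing characters XML 1.0 can't carry, or anything that happens to start with the marker, so decoding stays unambiguous.

```python
import base64

# Hypothetical marker; any prefix agreed on by writer and reader would do.
MARKER = "b64:"


def _xml_safe(text: str) -> bool:
    """True if every character can appear literally in XML 1.0 text."""
    for ch in text:
        code = ord(ch)
        if ch in "\t\n":
            continue
        # Control characters are illegal; "\r" is excluded too because
        # parsers normalize line endings, which would break round trips.
        if code < 0x20 or code in (0xFFFE, 0xFFFF):
            return False
    return True


def encode_field(raw: bytes) -> str:
    """Return readable text when possible, marker + Base64 otherwise."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        text = None
    if text is None or text.startswith(MARKER) or not _xml_safe(text):
        return MARKER + base64.b64encode(raw).decode("ascii")
    return text


def decode_field(text: str) -> bytes:
    """Reverse encode_field: strip the marker and decode if present."""
    if text.startswith(MARKER):
        return base64.b64decode(text[len(MARKER):])
    return text.encode("utf-8")
```

    Plain strings pass through untouched, so most of the file stays readable by eye; only the troublesome values turn into Base64. The result still needs the usual XML escaping of &, < and > when written.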

    -sam

      Yeah, that's the other thing: invent your own minimalist encoding. It's gross and yucky, and you might prefer a smaller character set like Latin-1 instead of UTF-8. You'd at least be swapping UTF-8's validity rules for an encoding where every octet is exactly one character. Heck, why not encode for ASCII? You'll be specifying the file's encoding in the XML declaration, so there's no ambiguity about the data representation on the consuming side. It's seven bits and nice to look at.
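
      To make the ASCII route concrete, here is a small sketch in Python (chosen only to keep it self-contained; the function names are made up for illustration). It declares US-ASCII in the XML declaration and writes every character above 0x7F as a numeric character reference, which any conforming parser turns back into the original character. Note the assumption that the data is text: control characters other than tab, LF and CR can't appear in XML 1.0 even as references, so truly binary values would still need something like base64.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def to_ascii_xml(tag: str, text: str) -> str:
    """Build a one-element document that contains only 7-bit characters."""
    # Escape &, < and > first, then replace every non-ASCII character
    # with a numeric reference such as &#233;.
    body = escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")
    return (
        '<?xml version="1.0" encoding="US-ASCII"?>\n'
        f"<{tag}>{body}</{tag}>\n"
    )


def from_ascii_xml(doc: str) -> str:
    """Parse the document back and return the element's text."""
    # Passing bytes lets the parser honor the declared encoding.
    return ET.fromstring(doc.encode("ascii")).text or ""
```

      The file stays readable in any editor, and a value like "café" shows up as "caf&#233;", which is ugly but unambiguous.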