in reply to Re: Re: 8-bit Clean XML Data I/O?
in thread 8-bit Clean XML Data I/O?
Now, most systems deal with this by context. Everyone uses the same encoding for input and output and it all works. Until someone uses a different locale. Or they cut-and-paste from an app that doesn't declare the encoding. Or they send the file/email/database to someone else.
Also, XML is logically defined in terms of Unicode characters. A file either uses one of the default encodings, UTF-8 or UTF-16, or it must declare its encoding. Many parsers convert from the declared encoding to Unicode strings and deal only with Unicode internally.
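To make the "convert from the declared encoding" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The sample document and the variable names are made up for illustration. Other parsers may behave differently, but the general pattern should be similar.

```python
import xml.etree.ElementTree as ET

# "caf\u00e9" is "cafe" with an acute accent on the e (U+00E9).
# Serialize the document as ISO-8859-1 bytes and declare that encoding.
latin1_doc = (
    '<?xml version="1.0" encoding="ISO-8859-1"?>'
    "<name>caf\u00e9</name>"
).encode("iso-8859-1")

# On the wire the accented character is the single byte 0xE9.
has_raw_byte = b"\xe9" in latin1_doc

# The parser reads the declaration and hands back a Unicode string.
root = ET.fromstring(latin1_doc)
parsed_text = root.text
```

After parsing, the application only ever sees the Unicode string; the original byte encoding matters only at the I/O boundary.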
Your choices are to: a) figure out what encoding is being used and mark the XML with that; b) generate invalid XML by leaving the encoding unmarked and writing raw 8-bit bytes instead of UTF-8; c) find a safe encoding and transform the Unicode back into binary bytes; d) transcode to UTF-8 and use that everywhere. Options a and d are the best solutions, and both are standard.
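Here is a rough sketch of options a, b, and d side by side, again in Python with xml.etree.ElementTree. It assumes you already know the source data is ISO-8859-1; the helper transcode_to_utf8 is a hypothetical name invented for this example, not part of any library.

```python
import xml.etree.ElementTree as ET


def transcode_to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes from a known legacy encoding and re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


# Legacy 8-bit data: "cafe" with an accented e, as ISO-8859-1 bytes.
legacy = "caf\u00e9".encode("iso-8859-1")

# Option b: raw 8-bit bytes with no declaration. The parser assumes
# UTF-8, finds an invalid byte sequence, and rejects the document.
undeclared = b"<name>" + legacy + b"</name>"
try:
    ET.fromstring(undeclared)
    undeclared_ok = True
except ET.ParseError:
    undeclared_ok = False

# Option a: keep the bytes as they are, but declare the encoding.
declared = (
    b'<?xml version="1.0" encoding="ISO-8859-1"?><name>'
    + legacy
    + b"</name>"
)
declared_text = ET.fromstring(declared).text

# Option d: transcode to UTF-8, which needs no declaration.
utf8_doc = b"<name>" + transcode_to_utf8(legacy, "iso-8859-1") + b"</name>"
utf8_text = ET.fromstring(utf8_doc).text
```

Both a and d give the parser the same Unicode text. The difference is that d removes the dependency on every downstream consumer supporting the legacy encoding.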
Re: Re: Re: Re: 8-bit Clean XML Data I/O?
by samtregar (Abbot) on Feb 22, 2004 at 21:42 UTC