samtregar has asked for the wisdom of the Perl Monks concerning the following question:
At the moment I'm writing the data using XML::Writer, with code like:
my $writer = XML::Writer->new(OUTPUT => $fh, DATA_MODE => 1, DATA_INDENT => 4);
$writer->dataElement(foo => $bar);
$writer->end();
Then, later I try to read it in again using XML::Simple:
my $data = XMLin($xml, %args);
This blows up when $bar contains bytes that aren't valid UTF-8:
not well-formed (invalid token) at line 25, column 102, byte 980 at /usr/local/krang/lib/i686-linux/XML/Parser/Expat.pm line 478
What is to be done?
UPDATE: Taking gmpassos's suggestion, I adopted a mechanism similar to XML::Smart. I created a sub-class of XML::Writer which will automatically Base64 encode character content which has illegal characters in it. This content is prefixed with a "!!!BASE64!!!" marker. I then created a sub-class of XML::Simple which will automatically decode these sections by looking for the marker.
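For readers who want to see the shape of the marker scheme described above, here is a minimal sketch of just the encode/decode step. It is not the actual sub-class code: the helper names (needs_base64, encode_content, decode_content) are hypothetical, and it assumes the content arrives as a byte string and that "illegal" means either invalid UTF-8 or a character outside the XML 1.0 Char range. Only the "!!!BASE64!!!" marker comes from the post itself.

```perl
use strict;
use warnings;
use Encode ();
use MIME::Base64 qw(encode_base64 decode_base64);

# Marker taken from the post; everything else here is illustrative.
my $MARKER = '!!!BASE64!!!';

# Hypothetical helper: true if the byte string is not valid UTF-8,
# or decodes to characters that XML 1.0 does not allow.
sub needs_base64 {
    my ($bytes) = @_;
    my $copy  = $bytes;    # decode() may modify its argument when a check is set
    my $chars = eval { Encode::decode('UTF-8', $copy, Encode::FB_CROAK()) };
    return 1 unless defined $chars;    # malformed UTF-8
    return $chars =~ /[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/
        ? 1 : 0;
}

# Hypothetical helper: Base64-encode and prefix with the marker only when needed.
sub encode_content {
    my ($bytes) = @_;
    return $bytes unless needs_base64($bytes);
    return $MARKER . encode_base64($bytes, '');    # '' means no line breaks
}

# Hypothetical helper: undo encode_content() by looking for the marker.
sub decode_content {
    my ($text) = @_;
    return $text unless index($text, $MARKER) == 0;
    return decode_base64(substr($text, length $MARKER));
}

my $raw   = "caf\xE9";               # a lone Latin-1 byte, not valid UTF-8
my $safe  = encode_content($raw);    # marker followed by Base64 text
my $plain = encode_content("hello"); # passes through untouched
```

In a writer sub-class, something like encode_content() would be applied to character data before it is written, and decode_content() to each text value after parsing. One known weakness of this sketch: legitimate content that happens to start with the marker would be decoded by mistake.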
It sure isn't pretty, but it sure does work. Maybe someday I'll come up with something more elegant, but until then I'm happy to mark this one FIXED in Bugzilla and move on. Thanks monks!
-sam

Replies are listed 'Best First'.

Re: 8-bit Clean XML Data I/O?
by mirod (Canon) on Feb 20, 2004 at 23:30 UTC
by samtregar (Abbot) on Feb 21, 2004 at 00:02 UTC

Re: 8-bit Clean XML Data I/O?
by gmpassos (Priest) on Feb 21, 2004 at 02:15 UTC
by samtregar (Abbot) on Feb 21, 2004 at 03:51 UTC

Re: 8-bit Clean XML Data I/O?
by diotalevi (Canon) on Feb 20, 2004 at 23:05 UTC
by samtregar (Abbot) on Feb 20, 2004 at 23:22 UTC
by diotalevi (Canon) on Feb 21, 2004 at 00:25 UTC
by samtregar (Abbot) on Feb 21, 2004 at 00:38 UTC
by diotalevi (Canon) on Feb 21, 2004 at 01:39 UTC

Re: 8-bit Clean XML Data I/O?
by mr_mischief (Monsignor) on Feb 20, 2004 at 23:06 UTC

Re: 8-bit Clean XML Data I/O?
by iburrell (Chaplain) on Feb 21, 2004 at 00:22 UTC
by samtregar (Abbot) on Feb 21, 2004 at 00:32 UTC
by iburrell (Chaplain) on Feb 22, 2004 at 20:41 UTC
by samtregar (Abbot) on Feb 22, 2004 at 21:42 UTC