At the moment I'm writing the data using XML::Writer, with code like:
my $writer = XML::Writer->new(OUTPUT => $fh, DATA_MODE => 1, DATA_INDENT => 4);
$writer->dataElement(foo => $bar);
$writer->end();
Later, I try to read it back in using XML::Simple:
my $data = XMLin($xml, %args);
This blows up when $bar contains bytes that aren't valid UTF-8:
not well-formed (invalid token) at line 25, column 102, byte 980 at /usr/local/krang/lib/i686-linux/XML/Parser/Expat.pm line 478
What is to be done?
UPDATE: Taking gmpassos's suggestion, I adopted a mechanism similar to XML::Smart. I created a sub-class of XML::Writer which will automatically Base64 encode character content which has illegal characters in it. This content is prefixed with a "!!!BASE64!!!" marker. I then created a sub-class of XML::Simple which will automatically decode these sections by looking for the marker.
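For anyone who wants the gist without the sub-classes, here is a minimal sketch of just the encode/decode step. It is not the actual code from my sub-classes: the function names (xml_safe, encode_content, decode_content) are made up for illustration, and it assumes $bar is a plain byte string rather than a string with Perl's UTF-8 flag set. The "safe" test is also only an approximation: no control characters other than tab, LF and CR, and the bytes must decode as strict UTF-8.

```perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64 decode_base64);
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Marker from the post; everything after it is Base64.
my $MARKER = '!!!BASE64!!!';

# Hypothetical helper: true if the byte string can go into XML as-is.
sub xml_safe {
    my ($bytes) = @_;
    # Control characters other than tab, LF and CR are not allowed in XML 1.0.
    return 0 if $bytes =~ /[\x00-\x08\x0B\x0C\x0E-\x1F]/;
    # The bytes must also be well-formed UTF-8; decode() dies under FB_CROAK if not.
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# Hypothetical helper: call on character content before handing it to the writer.
sub encode_content {
    my ($bytes) = @_;
    # Content that already starts with the marker is encoded too, so that the
    # reader never mistakes literal text for an encoded section.
    return $bytes if xml_safe($bytes) && index($bytes, $MARKER) != 0;
    return $MARKER . encode_base64($bytes, '');   # '' = no line breaks
}

# Hypothetical helper: call on each text value after parsing.
sub decode_content {
    my ($text) = @_;
    return $text unless index($text, $MARKER) == 0;
    return decode_base64(substr($text, length $MARKER));
}
```

With these in place the write side becomes $writer->dataElement(foo => encode_content($bar)), and the read side runs decode_content over each text value that comes back from XMLin. One caveat: content that passes through unencoded comes back from the parser as a character string, while decoded sections come back as raw bytes.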
It sure isn't pretty, but it sure does work. Maybe someday I'll come up with something more elegant, but until then I'm happy to mark this one FIXED in Bugzilla and move on. Thanks monks!
-sam
In reply to 8-bit Clean XML Data I/O? by samtregar