SheridanCat has asked for the wisdom of the Perl Monks concerning the following question:
I have a data provider who is sending me XML with the following at the top:
<?xml version="1.0" encoding="UNICODE"?>
And, sure enough, if I look at this in certain editors such as Eclipse, there are Kanji characters in there.
When I try parsing this with XML::LibXML, I get the following error:
text_file.xml:1: parser error : Unsupported encoding UNICODE <?xml version="1.0" encoding="UNICODE"?>
I get a similar error from XML::Simple, which is no surprise, I suppose.
I understand that expat has built-in support for UTF-8, ISO-8859-1, UTF-16, and US-ASCII. So, can anyone shed some light on how I can parse this Unicode XML?
If it matters, I'm running ActiveState Perl 5.8.3 on WinXP. Any bit of assistance is appreciated.
Regards, SheridanCat
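
One possible workaround, sketched here under assumptions rather than taken from the thread: "UNICODE" is not a registered encoding name, but Windows tools often use it to mean UTF-16, so the raw bytes can be decoded by hand and the declaration rewritten before the parser sees it. The helper name `normalize_unicode_xml` is hypothetical, and the guess that the data is UTF-16 (falling back to UTF-8) should be checked against the actual file.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the thread): turn XML bytes labelled
# encoding="UNICODE" into UTF-8 bytes with a matching declaration.
# Assumption: "UNICODE" is a Windows nickname for UTF-16.
sub normalize_unicode_xml {
    my ($bytes) = @_;

    my $text;
    if ($bytes =~ /^(?:\xFF\xFE|\xFE\xFF)/) {
        # A byte order mark is present; the "UTF-16" decoder uses it
        # to pick the byte order and consumes it.
        $text = decode('UTF-16', $bytes);
    }
    elsif ($bytes =~ /^\xEF\xBB\xBF/) {
        # UTF-8 BOM: the file is really UTF-8 with a misleading label.
        $text = decode('UTF-8', substr($bytes, 3));
    }
    elsif ($bytes =~ /^<\x00\?\x00/) {
        # No BOM, but "<?" is stored as little-endian 16-bit units.
        $text = decode('UTF-16LE', $bytes);
    }
    elsif ($bytes =~ /^\x00<\x00\?/) {
        # No BOM, big-endian 16-bit units.
        $text = decode('UTF-16BE', $bytes);
    }
    else {
        # Fallback guess: plain UTF-8 with a wrong label.
        $text = decode('UTF-8', $bytes);
    }

    # Rewrite only the encoding pseudo-attribute in the XML declaration.
    $text =~ s/\A(<\?xml[^>]*?encoding=)(["'])UNICODE\2/$1${2}UTF-8$2/i;

    # Hand back bytes, since that is what the parser expects to read.
    return encode('UTF-8', $text);
}
```

To use it, the file would be read with the `:raw` layer so that Perl does no translation of its own, and the returned bytes passed to the parser, for example with `XML::LibXML->new->parse_string($fixed)`.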
Re: Unicode XML Parsing Problem
by ikegami (Patriarch) on Sep 23, 2005 at 18:26 UTC
Re: Unicode XML Parsing Problem
by bart (Canon) on Sep 23, 2005 at 18:38 UTC
Re: Unicode XML Parsing Problem
by Errto (Vicar) on Sep 23, 2005 at 18:52 UTC
by SheridanCat (Pilgrim) on Sep 23, 2005 at 20:35 UTC