I get an error ...
Let me guess: It starts with THIS IS A TOP SECRET ERROR MESSAGE! NEVER POST THIS ERROR MESSAGE ANYWHERE! ESPECIALLY NOT AT PERLMONKS! A KITTEN WILL DIE IF YOU POST IT!.
... when I parse xml in XML::Parser when it gets to a unicode character.
So the XML is likely broken. Did you try to validate it? If the validation fails, the software that generated the XML has a bug. Also try to read the XML using XML::LibXML.
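A quick way to see whether the parser or the data is at fault is to let XML::LibXML try the same input and print what it complains about. This is only a sketch: the helper name `xml_error` and both sample documents are made up for illustration, and `load_xml` only checks well-formedness, not validity against a DTD or schema.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: returns '' if $xml parses, otherwise the parser's error text.
sub xml_error {
    my ($xml) = @_;
    my $ok = eval { XML::LibXML->load_xml(string => $xml); 1 };
    return $ok ? '' : ("$@" || 'unknown parser error');
}

# Made-up sample documents; replace them with the real input.
my %samples = (
    good   => '<customer><name>Koenig</name></customer>',
    broken => '<customer><name>Koenig</customer>',    # <name> is never closed
);

for my $label (sort keys %samples) {
    my $error = xml_error($samples{$label});
    print "$label: ", ($error ? "NOT well-formed: $error" : "well-formed"), "\n";
}
```

If XML::LibXML rejects the file as well, the problem is almost certainly in the data and not in XML::Parser.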
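Maybe the XML has an unusual encoding? Without an encoding declaration (and without a UTF-16 byte order mark), a parser has to assume UTF-8, but ISO-8859-1 and Windows-1252 are quite common. Perhaps the XML lacks an explicit encoding declaration but is actually written in a non-UTF-8 encoding?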
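To test that theory, check whether the raw bytes are valid UTF-8 at all. The following is a minimal sketch using the core Encode module; the helper name `is_valid_utf8` and the two byte strings are assumptions for illustration, not taken from the original data.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: true if $bytes decodes as strict UTF-8.
# LEAVE_SRC keeps decode() from modifying the caller's buffer.
sub is_valid_utf8 {
    my ($bytes) = @_;
    return eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

# The same element, once encoded as UTF-8 and once as ISO-8859-1.
my %samples = (
    'UTF-8 bytes'      => "<name>K\xC3\xB6nig</name>",
    'ISO-8859-1 bytes' => "<name>K\xF6nig</name>",
);

for my $label (sort keys %samples) {
    my $verdict = is_valid_utf8($samples{$label})
        ? 'valid UTF-8'
        : 'not valid UTF-8, needs an encoding declaration';
    print "$label: $verdict\n";
}
```

A file that fails this check and has no encoding declaration is broken XML, and the fix belongs in the software that generated it.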
Maybe XML::Parser has problems with XML delivered in a non-UTF-8 encoding? There is a clear hint in the XML::Parser documentation that you need to install extra encoding map files for encodings other than UTF-8, ISO-8859-1, UTF-16 and US-ASCII.
The company I am writing code for wants the unicode characters converted this way.
"Der Kunde ist König." (The customer is king.) But still, this is just stupid. Dropping accents, tildes and other "letter add-ons" can sometimes change the meaning of the text.
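To illustrate the point, here is a minimal sketch of the usual "decompose, then drop the combining marks" trick, using the core Unicode::Normalize module. The helper name `strip_marks` and the sample words are my own choice. Note how "drücken" (to press) turns into "drucken" (to print), and "fördern" (to promote) into "fordern" (to demand).

```perl
use strict;
use warnings;
use utf8;                              # this source file contains UTF-8 literals
use Unicode::Normalize qw(NFD);

# Hypothetical helper: decompose to NFD, then remove all nonspacing marks.
# Letters without a decomposition (for example the German sharp s) pass through unchanged.
sub strip_marks {
    my ($text) = @_;
    my $decomposed = NFD($text);
    $decomposed =~ s/\p{Mn}//g;
    return $decomposed;
}

binmode STDOUT, ':encoding(UTF-8)';
for my $word ('drücken', 'fördern', 'König') {
    printf "%s -> %s\n", $word, strip_marks($word);
}
```

The code is trivial; the damage it does to the text is the real problem.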
Alexander
In reply to Re^3: converting unicode string to ascii or encoded
by afoken
in thread converting unicode string to ascii or encoded
by dmn001