If your data contains only plain characters, ASCII would be enough.
But the producers of the XML file are telling you that this may not always hold, so you should expect to see some non-ASCII characters in the future.
As bart correctly pointed out, telling XML::Simple that the byte stream you are passing is UTF-8 solves your problem only until the XML actually contains a special character that is encoded as Latin-1 rather than UTF-8.
So, in addition to changing the declared encoding, you must also convert the stream itself to UTF-8.
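As a minimal sketch of that conversion, assuming the bytes really are Latin-1 (ISO-8859-1), the core Encode module can do the work before the data reaches the parser. The sample document in `$raw` and its `US-ASCII` declaration are made up for illustration; they are not taken from your file.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: bytes that are really Latin-1 (\xE9 is e-acute),
# behind a declaration that claims something else.
my $raw = qq{<?xml version="1.0" encoding="US-ASCII"?>\n<name>Jos\xE9</name>\n};

# Decode the bytes as Latin-1 into Perl characters, then encode them as UTF-8.
my $utf8 = encode('UTF-8', decode('ISO-8859-1', $raw));

# Make the XML declaration agree with the bytes that now follow it.
$utf8 =~ s/\A(<\?xml[^>]*?encoding=)(["'])[^"']*\2/${1}${2}UTF-8${2}/;
```

After this, `$utf8` should be safe to hand to `XMLin` from XML::Simple, since the declaration and the actual bytes now say the same thing.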
Unfortunately, examples like the one you brought here are abundant on the web: the declared encoding is not accurate, and many people get headaches dealing with such misleading declarations.
Some techniques try to avoid these problems, such as parsing the data on the assumption that it is UTF-8 and, if that fails, retrying after converting from another encoding, and so on.
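One way to sketch that "try, then fall back" idea with the core Encode module is shown here. The helper name `decode_guess` is made up for this example, and Latin-1 as the only fallback is an assumption; adjust the list to whatever encodings your producers are likely to use.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: try strict UTF-8 first, fall back to Latin-1.
sub decode_guess {
    my ($bytes) = @_;

    # FB_CROAK makes decode() die on malformed UTF-8 instead of
    # substituting characters; LEAVE_SRC keeps $bytes unmodified.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $text if defined $text;

    # Every byte sequence is valid Latin-1, so this step cannot fail.
    return decode('ISO-8859-1', $bytes);
}
```

Keep in mind this is a heuristic: Latin-1 text that happens to form valid UTF-8 sequences would be decoded as UTF-8, although that is uncommon in real text.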
I'd recommend the following reading:
Perl Unicode
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)