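You need to declare the encoding used in the document at the very start of it: <?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>. Without that declaration the parser assumes UTF-8 and chokes on the accented bytes. With it, the accented characters are converted to the proper numeric entities.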
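To make that mapping concrete, here is a small sketch of how ISO-8859-1 bytes correspond to numeric entities. It uses only the core Encode module. The helper name latin1_to_numeric_entities is made up for this illustration, and this is not a claim about what XML::Parser does internally, just the byte-to-entity correspondence that the encoding declaration makes possible:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper (not part of XML::Parser or XML::Twig):
# decode ISO-8859-1 bytes into characters, then replace every
# non-ASCII character with its numeric character reference.
sub latin1_to_numeric_entities {
    my ($bytes) = @_;
    my $text = decode('ISO-8859-1', $bytes);
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# "été" written as raw ISO-8859-1 bytes (0xE9 is e-acute).
my $label = "\xE9t\xE9";
print latin1_to_numeric_entities($label), "\n";   # prints &#233;t&#233;
```

The same bytes read as UTF-8 would be invalid, which is why the declaration matters.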
You may also want to use mirod's XML::Twig, which, in the latest version (3.0), is able to keep the original encoding. Besides this cool feature, I find Twig easier to use than DOM for processing XML docs. Here's what your code would look like with Twig:
#!/usr/bin/perl -w
use strict;
use XML::Twig 3.0;

my $parser = new XML::Twig( keep_encoding => 1 );

my $xmlstring=<<"XMLEND";
<?xml version="1.0" encoding="ISO-8859-1"?>
<ACTION>
<INPUT LABEL="Radio Button"/>
<INPUT LABEL="été"/>
<RADIO ID="List">
éééàààùùù
</RADIO>
</ACTION>
XMLEND

$parser->parse($xmlstring);
$parser->print;
Hope this helps!
update: version 3.0 of XML::Twig can be found here
In reply to Re: XML::Parser multilanguage support by OeufMayo, in thread XML::Parser multilanguage support by lucdewav