http://qs1969.pair.com?node_id=11133175


in reply to XML::Simple and ISO-8859-1 encoding buggy?

XML::Simple's design is extremely problematic, so much so that the module's own documentation tells you not to use it. wtf are you doing using this module?!


XML::Simple and ISO-8859-1 encoding buggy?

Decoding is handled by the XML parser. You didn't specify which XML parser you are using. (No, XML::Simple is not an XML parser.) XML::Parser is commonly used by XML::Simple, and XML::Parser handles iso-8859-1 just fine.

use 5.014;
use warnings;

use XML::Simple qw( :strict );  # Taken from OP.
use File::Slurper qw( read_binary );

my $xml = read_binary($ARGV[0]);

# Make sure we know which parser is being used.
local $XML::Simple::PREFERRED_PARSER = 'XML::Parser';

# Taken from OP.
my $doc = XMLin($xml, ForceArray => 1, KeyAttr => [ ]);

say sprintf "%vX", $doc;

$ perl a.pl a_latin1.xml
E9

$ perl a.pl a_utf8.xml
E9

On a terminal expecting UTF-8:

$ cat a_utf8.xml
<?xml version="1.0"?><root>é</root>

$ cat a_latin1.xml | iconv -f iso-8859-1
<?xml version="1.0" encoding="ISO-8859-1"?><root>é</root>
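If you want to recreate the two input files without depending on your editor's encoding settings, here is a minimal sketch. It assumes a POSIX shell with printf and od available, and it reuses the file names from this post. It writes the bytes explicitly and then dumps them. The é is the single byte E9 in the ISO-8859-1 file and the two bytes C3 A9 in the UTF-8 file. Both decode to the same character, U+00E9.

```shell
# Write the UTF-8 file: é is encoded as the two bytes C3 A9 (octal 303 251).
printf '<?xml version="1.0"?><root>\303\251</root>\n' > a_utf8.xml

# Write the ISO-8859-1 file: é is the single byte E9 (octal 351).
printf '<?xml version="1.0" encoding="ISO-8859-1"?><root>\351</root>\n' > a_latin1.xml

# Dump the raw bytes of each file to confirm what is actually on disk.
od -An -tx1 a_utf8.xml
od -An -tx1 a_latin1.xml
```

Comparing the two dumps makes it easy to tell a file that is really ISO-8859-1 from one that merely declares it.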

Seeking work! You can reach me at ikegami@adaelis.com