gizzlon has asked for the wisdom of the Perl Monks concerning the following question:
This XML file:

    <?xml version="1.0" encoding="iso-8859-1" standalone="yes"?>
    <foo>
      <p>foo</p>
      <p>foo with an –</p>
      <p>some latin1 encoded chars: æøå ÆØÅ</p>
      <p>same, but this time whith an – .. æøå ÆØÅ</p>
      <p>same, but this thime with an ” instead .. : æøå ÆØÅ</p>
    </foo>

Read by this script:

    use XML::Simple;
    use Data::Dumper;

    my $xmldata = XMLin( $ARGV[0], ForceArray=>1, KeyAttr=>{meta=>"name"}, SuppressEmpty=>"")
        or die "Could not parse xml data: $!";

    # Print each <p> element, then dump the whole parsed structure
    foreach my $f ( @{$xmldata->{'p'} } ) {
        print $f;
        print "\n";
    }
    print Dumper($xmldata);

Produces:

    ./test2.pl foo.xml
    foo
    Wide character in print at ./test2.pl line 12.
    foo with an –
    some latin1 encoded chars: æøå ÆØÅ
    Wide character in print at ./test2.pl line 12.
    same, but this time whith an – .. æøå ÆØÅ
    Wide character in print at ./test2.pl line 12.
    same, but this thime with an ” instead .. : æøå ÆØÅ
    $VAR1 = {
              'p' => [
                       'foo',
                       "foo with an \x{2013}",
                       'some latin1 encoded chars: æøå ÆØÅ',
                       "same, but this time whith an \x{2013} .. \x{c3}\x{a6}\x{c3}\x{b8}\x{c3}\x{a5} \x{c3}\x{86}\x{c3}\x{98}\x{c3}\x{85}",
                       "same, but this thime with an \x{201d} instead .. : \x{c3}\x{a6}\x{c3}\x{b8}\x{c3}\x{a5} \x{c3}\x{86}\x{c3}\x{98}\x{c3}\x{85}"
                     ]
            };

Looks like it's double encoded?
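For anyone trying to reproduce the "Wide character in print" part without XML::Simple, here is a minimal sketch of how Perl chooses the bytes it writes when a filehandle has no encoding layer. The helper name `printed_bytes` is made up for this illustration, and the strings are built by hand rather than parsed from XML, so it only shows the print behaviour. It does not by itself explain the `\x{c3}\x{a6}` sequences in the dump above.

```perl
use strict;
use warnings;

# Hypothetical helper: print $str to an in-memory filehandle, optionally
# with an I/O layer, and return the bytes written as space-separated hex.
sub printed_bytes {
    my ($str, $layer) = @_;
    my $buf = '';
    open(my $fh, '>', \$buf) or die "Cannot open in-memory handle: $!";
    binmode($fh, $layer) if defined $layer;
    {
        no warnings 'utf8';    # silence "Wide character in print" for the demo
        print {$fh} $str;
    }
    close($fh);
    return join ' ', unpack('(H2)*', $buf);
}

# Only characters <= 0xFF, no layer: written as single Latin-1 bytes.
print printed_bytes("\x{e6}"), "\n";                       # e6

# One character > 0xFF in the string, no layer: the whole string is
# written as UTF-8 (this is the case that triggers the warning).
print printed_bytes("\x{e6}\x{2013}"), "\n";               # c3 a6 e2 80 93

# With an explicit layer, every string is encoded the same way.
print printed_bytes("\x{e6}", ':encoding(UTF-8)'), "\n";   # c3 a6
```

Under these assumptions, calling `binmode(STDOUT, ':encoding(UTF-8)')` (or whatever encoding the terminal expects) before printing should at least make all five lines come out in one consistent encoding and remove the warning.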
Re: Entities confuse encoding in XML::Simple
by moritz (Cardinal) on Jan 03, 2008 at 11:30 UTC
by Anonymous Monk on Jan 03, 2008 at 13:17 UTC
by Anonymous Monk on Jan 09, 2008 at 14:29 UTC
by gizzlon (Initiate) on Jan 09, 2008 at 18:02 UTC