Welcome to the wonderful world of XML!
I can't figure out exactly what your original format is, but I will nevertheless go for the shameless plug:
<shameless_plug>XML::Twig will happily deal with this problem. Get the latest version (3.00) from here and you won't have to bother with entities being dropped.</shameless_plug>
Try playing with this code (with and without the keep_encoding option, for example):
#!/bin/perl -w
use strict;
use XML::Twig;

# keep_encoding => 1 asks XML::Twig to output the text in its original encoding
my $t= XML::Twig->new( keep_encoding => 1);

{ local $/= '';               # paragraph mode: one blank-line-separated document per read
  while( <DATA>)
    { $t->parse( $_);
      $t->print;
      print "\n";
    }
}

__DATA__
<?xml version="1.0" encoding="ISO-8859-1"?>
<!DOCTYPE doc SYSTEM "dummy"[]>
<doc att="valué ">A document with text in latin1: soupçonné d'être</doc>

<?xml version="1.0"?>
<!DOCTYPE doc SYSTEM "dummy"[]>
<doc att="valué">A document with text in latin1:soupçonné d'etre</doc>
In reply to Re: XML and entities, what am I doing wrong?
by mirod
in thread XML and entities, what am I doing wrong?
by kevin_i_orourke