kevin_i_orourke has asked for the wisdom of the Perl Monks concerning the following question:

I've been playing around with XML as a way of easing maintenance of my personal website, but mostly just as something to play around with.

Recently I've been using XML::XPath, and it seems to do most of the things I want. However, it also seems to strip entities out of the XML it reads.

For example, I have an XHTML input file which I read in, process and write out. When I read it in, it contains entities such as &eacute; (that's é); when I write it out again, they have been deleted. This means that things like 'Port aux Français' come out as 'Port aux Franais', not good.

Any ideas how I can stop this?

--
Kevin O'Rourke

Re: XML::XPath and entities
by Gloom (Monk) on May 29, 2001 at 16:09 UTC
    Be sure that your XML source has the right encoding, that is ISO-8859-1 for Latin-1 (including French characters). I don't know if it's the solution, but maybe it's a clue =)

    Gloom
    ____________________
    Hope this helps

      Unfortunately this doesn't seem to make any difference. I had a play with the different styles of XML::Parser and most of them seem to do the same, just deleting the entities. If I provide my own handlers the entities appear as raw character data.
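
To make the "own handlers" case concrete, here is a minimal sketch, not the code actually used in this thread. It assumes the input carries a DOCTYPE pointing at an external DTD that the parser never reads, so a reference like &ccedil; is undeclared as far as expat is concerned. The sample document, the $out variable and the regular expression are all made up for illustration. The idea is that references expat cannot expand are handed to the Default handler as raw text, where they can be copied through instead of dropped.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical sample input: the DOCTYPE names an external DTD, which is not
# fetched with default options, so &ccedil; is undeclared for the parser.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<p>Port aux Fran&ccedil;ais</p>
XML

my $out = '';

my $parser = XML::Parser->new(
    Handlers => {
        # Ordinary character data inside elements arrives here.
        Char => sub {
            my ($expat, $text) = @_;
            $out .= $text;
        },
        # Anything without a dedicated handler arrives here, including the
        # XML declaration, the DOCTYPE and unexpanded entity references.
        # Keep only strings that look like a single entity reference.
        Default => sub {
            my ($expat, $text) = @_;
            $out .= $text if $text =~ /^&[A-Za-z_][\w.-]*;$/;
        },
    },
);

$parser->parse($xml);
print "$out\n";
```

This only collects text; a real filter would also need Start and End handlers to write the tags back out. Whether the same trick can be hooked into XML::XPath is a separate question that this sketch does not answer.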

      Looks like I'm doing something wrong.

      --
      Kevin O'Rourke