http://qs1969.pair.com?node_id=254478


in reply to Re: Latin-1 characters and XML
in thread Latin-1 characters and XML

And what do you use to output non-ASCII characters as XML-escaped numeric entities? (e.g. 'ô' -> '&#244;')
Is there an XML::Parser method to do this (if there is one, it's completely undocumented afaict), or do you use a separate module?

Re3: Latin-1 characters and XML
by dragonchild (Archbishop) on Apr 30, 2003 at 21:26 UTC
    What reader method do I use to write XML-escaped entities?!? Think about that for a second. XML::Parser reads the XML. It doesn't write it.

    You want some XML writer, of which there are many. And, yes, they will work with Latin-1. Another option is to use something like Unicode::String.

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

    Please remember that I'm crufty and crotchety. All opinions are purely mine and all code is untested, unless otherwise specified.

      It doesn't seem entirely implausible to me that a module that can convert escaped Latin-1 characters into Unicode could also manage the reverse process. It's a fairly reasonable thing to want to do, after all.
      Incidentally, I did try looking at XML::Writer (which I'd assume is one of the many), and it didn't seem to have a method to do this either.
      As it happened, I ended up using Unicode::String and a regex ( 's|([\200-\377])|sprintf("&#%i;", ord($1))|ge' ), which works, afaict.
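
      For anyone who wants to try the regex approach, here is a minimal standalone sketch. It assumes the text is already a Perl character string (decoded Latin-1 or Unicode), and the helper name escape_non_ascii is made up for illustration; it is not part of XML::Parser, XML::Writer, or Unicode::String. It uses a slightly wider character class than the one above, so that any code point above 127 is escaped, not just 128-255.

```perl
use strict;
use warnings;

# Hypothetical helper (our own name, not from any CPAN module):
# replace every character above U+007F with a decimal numeric
# character reference such as &#244;.
# Assumes $text holds characters, not undecoded UTF-8 bytes.
sub escape_non_ascii {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/sprintf('&#%d;', ord($1))/ge;
    return $text;
}

# "\x{F4}" is 'ô' (U+00F4, decimal 244).
print escape_non_ascii("c\x{F4}te"), "\n";   # prints c&#244;te
```

      Note that this only handles non-ASCII characters. Markup characters such as & and < still need their own escaping, which an XML writer module would normally do for you.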