PerlMonks  

Re: Latin-1 characters and XML

by dragonchild (Archbishop)
on Apr 30, 2003 at 13:40 UTC (#254272=note)


in reply to Latin-1 characters and XML

I'm using XML::Parser with Perl 5.005_3, and I've got both Latin-1 and UTF-8 (for double-byte characters) working with it. (This is all in a production system.)

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.

Replies are listed 'Best First'.
Re: Re: Latin-1 characters and XML
by kilinrax (Deacon) on Apr 30, 2003 at 21:13 UTC
    And what do you use to output non-ASCII characters as XML-escaped numeric entities? (e.g. 'ô' -> '&#244;')
    Is there an XML::Parser method to do this (if there is one, it's completely undocumented afaict), or do you use a separate module?
      What reader method do I use to write XML-escaped entities?!? Think about that for a second. XML::Parser reads the XML. It doesn't write it.

      You want some XML writer, of which there are many. And, yes, they will work with Latin-1. Another option is to use something like Unicode::String.

        It doesn't seem entirely implausible to me that a module that can convert escaped Latin-1 characters into Unicode could also manage the reverse process. It's a fairly reasonable thing to want to do, after all.
        Incidentally, I did try looking at XML::Writer (which I'd assume is one of the many), and it didn't seem to have a method to do this either.
        As it happened, I ended up using Unicode::String and a regex ( 's|([\200-\377])|sprintf("&#%i;", ord($1))|ge' ), which works, afaict.
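For anyone finding this later, here is a minimal sketch of the regex approach described above, without the Unicode::String step. It assumes the input is already a string of Latin-1 bytes, where each byte value equals its Unicode code point, so ord() gives the right number for the entity. The sub name latin1_to_entities is made up for illustration, and the code has only been checked against the simple case shown.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of any module): replace every high-bit
# Latin-1 byte (0x80-0xFF) with an XML numeric character reference.
# Assumes $text holds Latin-1 bytes, so byte value == code point.
sub latin1_to_entities {
    my ($text) = @_;
    $text =~ s/([\x80-\xFF])/sprintf("&#%d;", ord($1))/ge;
    return $text;
}

my $input = "c\xF4te";                     # "côte" as Latin-1 bytes
print latin1_to_entities($input), "\n";    # prints c&#244;te
```

Note that this only handles the non-ASCII range. The markup characters &, < and > still need their own escaping before the result goes into an XML document.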
