Other than requiring everyone to use 4-byte Unicode (which I agree would make life a lot easier for us grunts!), what possible solutions do you have in mind?
For example, you complain that the output from the parser uses the :raw layer.
- How would you have the encoding pragma propagate appropriately? Maybe it's a lexically scoped pragma. (Think strict and warnings.)
- Can you pass an $fh in to XML::Parser? If you can, is its IO layer set correctly if you open it in your script?
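A rough sketch of how you could check that second point, assuming Perl 5.8 or later (where PerlIO layers exist). It opens a handle with an explicit layer and asks Perl which layers are on it. The sample data and the variable names are made up for illustration, and nothing here says how XML::Parser itself treats such a handle.

```perl
use strict;
use warnings;

# Hypothetical sample data: the UTF-8 bytes for "<doc>cafe</doc>" with e-acute.
my $xml = "<doc>caf\xc3\xa9</doc>";

# Open an in-memory handle for reading with an explicit decoding layer.
open my $fh, '<:encoding(UTF-8)', \$xml
    or die "open failed: $!";

# PerlIO::get_layers is built in as of 5.8; it lists the layers on a handle.
my @layers = PerlIO::get_layers($fh);
print "layers: @layers\n";

# With the decoding layer in place, length() counts characters, not bytes.
my $line = <$fh>;
print "characters read: ", length($line), "\n";

close $fh;
```

Running the same check on a handle just before you hand it to a module at least tells you whether the layer you think you set is actually there.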
I'm not sure what the right solution is in 5.8.x, let alone in modules (like PDF::Template which uses XML::Parser) that have to support 5.005, 5.6.x, and 5.8.x. (I have nightmares about this, frankly, especially because I speak only Latin-1 languages.)
You might be interested in the plethora of discussions that the parrot-dev and perl6-language lists have been having about this. If they are still having issues after working on the problem for over a year, it's amazing that a bunch of modules ad-hoc'ed together work at all, let alone as well as they do!
------
We are the carpenters and bricklayers of the Information Age.
Then there are Damian modules.... *sigh* ... that's not about being less-lazy -- that's about being on some really good drugs -- you know, there is no spoon. - flyingmoose
I shouldn't have to say this, but any code, unless otherwise stated, is untested