When you benchmark, be sure to include the time spent building the output string, since Genx won't have a handle to write to.

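One problem with XML 1.0 is that they made some stupid decisions with regard to control characters: most of them (everything below space except tab, newline, and carriage return) are forbidden outright, even as character references. This is likely fixed in the next version of the XML spec (which I assume is still not finished).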

In my experience, the majority of XML parsers are actually non-compliant on this point (perhaps a form of civil disobedience or a subconscious revolt against a design misfeature?), so producing non-compliant XML has a practical advantage for me. If Genx is compliant on this point, then switching will probably be too much thrash to be worth the minor benefit.

Once XML 1.1 becomes available, the stupid design decision will be restricted to NUL characters, which is an acceptable compromise. So using Genx and letting the user select which version of XML they want output would be great.
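To make the difference concrete, here is a minimal sketch of a check for whether a string's control characters can be represented at all under each version. The function name `representable_in_xml` is made up for illustration and is not part of Genx or any module mentioned here. It only looks at the C0 control characters discussed above (it ignores surrogates and U+FFFE/U+FFFF), and note that under XML 1.1 the newly allowed controls still have to be written as character references such as `&#x1;` rather than literally.

```perl
use strict;
use warnings;

# C0 controls that XML 1.0 cannot represent at all:
# everything below U+0020 except tab (09), LF (0A) and CR (0D).
my $forbidden_xml10 = qr/[\x00-\x08\x0B\x0C\x0E-\x1F]/;

# XML 1.1 narrows that down to NUL only.
my $forbidden_xml11 = qr/\x00/;

# Hypothetical helper: returns 1 if every control character in $text
# can be represented in the given XML version, 0 otherwise.
sub representable_in_xml {
    my ($text, $version) = @_;
    my $forbidden = $version eq '1.1' ? $forbidden_xml11 : $forbidden_xml10;
    return $text =~ $forbidden ? 0 : 1;
}

for my $version ('1.0', '1.1') {
    printf "XML %s: U+0001 %s, NUL %s\n",
        $version,
        representable_in_xml("a\x01b", $version) ? 'ok' : 'forbidden',
        representable_in_xml("a\x00b", $version) ? 'ok' : 'forbidden';
}
```

A generator that lets the user pick the output version could run a check like this before writing text and either die or drop the offending characters, depending on how strict it wants to be.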

Only being able to produce UTF-8 may have some interesting consequences. We have a hard time getting people to deal with encodings in XML correctly. The change will likely cause some disruption, though it may also ease some problems. For example, cbhistory still produces UTF-8 output but claims it is Latin-1 (because it feeds Latin-1 to its XML parser, but the parser insists on producing UTF-8 output, and the author didn't appreciate this fact). So such a change might fix this problem and/or cause it to appear in more places. I just mention this in hopes that this somewhat minor point will be properly addressed if a change is made.
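For what it's worth, the fix for that kind of mismatch is small: decode the input from whatever it really is, and make sure the declaration matches the bytes you actually emit. A rough sketch using the core Encode module follows. The function name `latin1_to_utf8_xml` and the `<note>` element are invented for the example, and this is not how cbhistory or Genx actually does it.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes Latin-1 bytes and returns a UTF-8 encoded
# XML document whose declaration agrees with its real encoding.
sub latin1_to_utf8_xml {
    my ($latin1_bytes) = @_;

    # Turn the Latin-1 bytes into Perl characters.
    my $chars = decode('ISO-8859-1', $latin1_bytes);

    # Minimal escaping of markup characters in text content.
    $chars =~ s/&/&amp;/g;
    $chars =~ s/</&lt;/g;
    $chars =~ s/>/&gt;/g;

    my $doc = qq{<?xml version="1.0" encoding="UTF-8"?>\n}
            . qq{<note>$chars</note>\n};

    # Emit UTF-8 bytes, matching the declaration above.
    return encode('UTF-8', $doc);
}

binmode STDOUT;    # we are printing already-encoded bytes
print latin1_to_utf8_xml("caf\xE9 & cr\xE8me");
```

The point is only that the declared encoding and the emitted bytes are decided in the same place, so they cannot drift apart the way they did in the cbhistory case.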

- tye        


In reply to Re: XML::Fling begone? (ctrl, utf-8) by tye
in thread XML::Fling begone? by Aristotle
