in reply to Re^6: Scrubbing XML
in thread Scrubbing XML

Ah! News to me! If libxml does it, I'm sure it's in the spec. And here it is: 3.3.3 Attribute-Value Normalization.

  1. \r\n is converted to \n. (Done for the entire document.)
  2. Entity references (e.g. &eacute;) are interpolated.
  3. \r, \n and \t are converted to spaces.
  4. Character references (e.g. &#233;) are interpolated.
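
The effect of those steps is easy to see in practice. Here is a minimal sketch, assuming Python's standard-library parser (expat via xml.etree.ElementTree) rather than the libxml discussed in this thread; the element name item, the attribute name note and the helper read_note are made up for illustration. Literal whitespace in the attribute comes back as spaces, while whitespace written as character references survives. The second half uses xml.sax.saxutils.quoteattr, which writes tab, LF and CR as character references, so the value round-trips.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def read_note(xml_text):
    """Parse a one-element document and return its (hypothetical) 'note' attribute."""
    return ET.fromstring(xml_text).get("note")


# Literal tab, LF and CRLF inside the attribute value, followed by
# the character references &#10; (LF) and &#9; (tab).
raw = '<item note="a\tb\nc\r\nd&#10;e&#9;f"/>'
normalized = read_note(raw)
print(repr(normalized))  # 'a b c d\ne\tf'

# Round trip: quoteattr() emits tab/LF/CR as character references,
# so the parser hands back the original value unchanged.
original = "a\tb\nc"
escaped = "<item note=%s/>" % quoteattr(original)
round_tripped = read_note(escaped)
print(escaped)                      # <item note="a&#9;b&#10;c"/>
print(round_tripped == original)    # True
```

Note that \r\n collapses to a single space: step 1 turns it into \n before step 3 replaces the \n with a space.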

Re^8: Scrubbing XML
by Jenda (Abbot) on Jun 05, 2011 at 00:34 UTC

    Nice. Looks like yet another example of the insanity of the XML specification authors. It's good to know XML parsers MUST corrupt the value of attributes. Really guys ... how many of you knew you have to escape your tabs and newlines when including the data in XML attributes? And which XML generators do that?

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.