in reply to Beautifying some SGML?

In general, to pretty-print XML, I have used XML::Tidy and xml_pp (which is part of XML::Twig). Both Do What I Want in most cases.

However, I have discovered a corner case in which neither DWIW: mixed-content elements, that is, elements that contain both other elements and text. For these, I can't figure out how to get either module to indent the child elements.

Your example markup falls into this category. If I run xml_pp on your text, it does not indent the "apple" tags, which I would have expected it to do. Perhaps the module's author could comment on this.

Re^2: Beautifying some SGML? (XML)
by mirod (Canon) on Dec 04, 2008 at 15:04 UTC

    It is a feature. As JavaFan mentioned before, adding whitespace is a bit risky, as you have to make sure that it is non-significant. So xml_pp adds line breaks and indentation between tags only if the element contains no other data. This is not guaranteed to be perfectly safe, but since the DTD is generally unavailable, that's about the best it can do. As soon as there is non-whitespace data in the element, no indentation is added, because according to the XML spec the whitespace IS significant in that case.
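
    To make the rule concrete, here is a rough sketch of it in Python, using the standard library's xml.etree.ElementTree. It is not xml_pp's actual code: the names has_text, safe_indent and pretty are made up for illustration, and leaving the whole subtree of a mixed-content element untouched is a conservative assumption of the sketch, not a statement about what xml_pp does.

```python
import xml.etree.ElementTree as ET


def has_text(elem):
    """True if elem directly contains non-whitespace character data."""
    if elem.text and elem.text.strip():
        return True
    # Text that follows a child is stored on that child's .tail.
    return any(child.tail and child.tail.strip() for child in elem)


def safe_indent(elem, level=0, step="  "):
    """Indent children only where the parent holds no non-whitespace text."""
    if len(elem) == 0 or has_text(elem):
        # Leaf or mixed content: whitespace may be significant, so the
        # subtree is left exactly as it was (assumption of this sketch).
        return
    elem.text = "\n" + step * (level + 1)
    for i, child in enumerate(elem):
        safe_indent(child, level + 1, step)
        last = i == len(elem) - 1
        # The last child is followed by the parent's closing tag.
        child.tail = "\n" + step * (level if last else level + 1)


def pretty(xml_text):
    """Parse xml_text, apply safe_indent, and return the serialized result."""
    root = ET.fromstring(xml_text)
    safe_indent(root)
    return ET.tostring(root, encoding="unicode")
```

    With this sketch, pretty("<a><b>x</b><c>y</c></a>") puts b and c on their own indented lines, while a hypothetical mixed-content input such as "<fruit>some text <apple>red</apple> more</fruit>" comes back unchanged, which is the behaviour described above.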

    It would be possible to add extra options to control the behavior of xml_pp more precisely, but that would be quite tricky and error-prone. At this point it's easier to write a custom pretty printer for your data. At least it's easier for me! ;--)