Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,
I'm fairly new to Perl and recently started learning it to parse and work with XML files. Pretty neat and easy.
I'm currently using XML::LibXML and have a formatting problem. I am adding a new node to an existing node using

$old_node->addChild($new_node);

This works fine, but the formatting within each file is inconsistent. For example, my output looks something like:

<root_tag> <element1>text</element1> #space <element2>text</element2> #space <element3>txt</element3> <new_node> <new_element1>text</new_element1> <new_element2>text</new_element2><new_element3> </new_element3> </new_node> </root_tag>

In the above example my existing XML file has spaces between elements, whereas the content from my new source file (i.e. <new_node>) doesn't.
Also, the tags get messed up when copied, i.e. <new_element2>text</new_element2><new_element3>
Indentation is also not precise. I'm outputting my results to a file using
$new_object->toString(2);
$new_object->toFile('file_name.xml');
So in summary, the question is: what's the best way to get nicely formatted XML output when using LibXML?

Replies are listed 'Best First'.
Re: XML output formatting
by Discipulus (Canon) on Dec 10, 2013 at 08:48 UTC
    Hello, if you want granular control over spacing you can use another module: XML::Twig, which has a lot of nice controls for spacing, pretty-printing and indentation.
    Well, if you have already learned LibXML and only want to refine the spacing, pipe the output to this one-liner:
    cat xmlpmonks.xml | perl -e "use XML::Twig; undef $/; $t=XML::Twig->new(keep_spaces=>1); $t->parse(<>) or die $!; $t->print;"
    As you can see I'm using keep_spaces, but there are also: discard_spaces, discard_all_spaces, discard_spaces_in and keep_spaces_in.
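
For comparison, here is a minimal sketch of the opposite approach: letting XML::Twig drop the existing whitespace and re-indent everything through its pretty_print option. The input string is made up for illustration and only loosely follows the tag names in the question.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical input: uneven spacing, roughly like the output in the question.
my $xml = '<root_tag> <element1>text</element1>'
        . '<new_node><new_element1>text</new_element1></new_node> </root_tag>';

# pretty_print => 'indented' asks XML::Twig to re-indent on output.
# keep_spaces is left off, so whitespace-only text between elements is dropped.
my $twig = XML::Twig->new(pretty_print => 'indented');
$twig->parse($xml);

# sprint returns the serialized document as a string instead of printing it.
my $pretty = $twig->sprint;
print $pretty;
```

Other pretty_print styles exist (for example 'record' and 'nice'); which one reads best depends on the data.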

    Hth L*
    There are no rules, there are no thumbs..
    Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
Re: XML output formatting (XML::LibXML::PrettyPrint)
by Anonymous Monk on Dec 10, 2013 at 06:28 UTC

    So in summary, the question is: what's the best way to get nicely formatted XML output when using LibXML?

    IMHO, the best way is to not worry about it, libxml doesn't care :p

    Also, xml_pp, XML::LibXML::PrettyPrint
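
If you do want to stay inside XML::LibXML, one possible sketch (not the only way) is to parse with no_blanks so that whitespace-only text nodes are dropped, and then pass a format argument when serializing. Note that toString returns the string and does not change the document, so calling it and then toFile without a format has no effect on the file; toFile takes the format as its second argument. The input below is invented for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical input that mimics the mixed spacing described in the question.
my $xml = <<'XML';
<root_tag> <element1>text</element1>

<element2>text</element2>

<element3>txt</element3> </root_tag>
XML

# no_blanks asks the parser to discard ignorable whitespace-only text nodes,
# which otherwise stop libxml2 from re-indenting the existing elements.
my $doc  = XML::LibXML->load_xml(string => $xml, no_blanks => 1);
my $root = $doc->documentElement;

# Build the new node without any whitespace of its own.
my $new_node = $doc->createElement('new_node');
$new_node->appendTextChild('new_element1', 'text');
$new_node->appendTextChild('new_element2', 'text');
$root->addChild($new_node);

# Format 1 lets libxml2 add indentation and line breaks on output.
my $pretty = $doc->toString(1);
print $pretty;

# To write straight to a file instead (file name taken from the question):
# $doc->toFile('file_name.xml', 1);
```

If the new nodes come from a second file, that file would need to be parsed with no_blanks as well; otherwise its whitespace text nodes come along when the nodes are imported.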

      XML is a data format, rather than a 'display' format. You can coerce it to 'look pretty' but I'd suggest you're thinking the wrong way if you do. You don't view raw HTML, and expect it to look 'nice', because that's the point of having formatting tags in the first place.

      Consider instead using an XSL style sheet - there are some generic ones that exist, or you can put together your own for your data. It's essentially a set of instructions that transforms 'XML tags' into 'format', much the same way as your browser does with the well-defined HTML tags. (HTML being a subset of XML in the first place).

        ... not op ... ... HTML being a subset of XML in the first place

        First came SGML then came HTML then came XML ... is HTML a subset of XML? Not really

Re: XML output formatting
by locked_user sundialsvc4 (Abbot) on Dec 10, 2013 at 15:53 UTC

    The XML file-format cares nothing for spaces, and usually does not include them, just to save bytes.

    As this O’Reilly article shows, XSLT can be used both to extract relevant parts of an XML document, and to pretty-print the extracted data. No programming is required (except to the extent that XSLT is a form of programming ...).

    If you are using LibXML, then all of these capabilities are at your beck and call.
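
As a rough sketch of what that could look like from Perl, the following uses XML::LibXSLT (a separate module from XML::LibXML, assumed to be installed) with a generic identity stylesheet that strips whitespace-only text and asks for indented output. The stylesheet and the input are illustrative and do not come from the article mentioned above.

```perl
use strict;
use warnings;
use XML::LibXML;
use XML::LibXSLT;

# Identity transform: copy every node, strip whitespace-only text, indent output.
my $style_doc = XML::LibXML->load_xml(string => <<'XSL');
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
XSL

my $stylesheet = XML::LibXSLT->new->parse_stylesheet($style_doc);

# Hypothetical input with uneven spacing.
my $source = XML::LibXML->load_xml(string =>
    '<root_tag> <element1>text</element1><new_node>'
  . '<new_element1>text</new_element1></new_node> </root_tag>');

my $result = $stylesheet->transform($source);

# output_as_chars serializes according to the stylesheet's xsl:output settings.
my $pretty = $stylesheet->output_as_chars($result);
print $pretty;
```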

Re: XML output formatting
by roboticus (Chancellor) on Dec 10, 2013 at 22:47 UTC

    As others have said, I'd suggest you not worry about the formatting of the XML document if you are simply using it.

    However, if you're trying to understand a file, it's frequently nice to have it formatted. I use xmllint --format inputfile >outputfile to reformat an XML document when I need to open it in an editor just to check it out structurally.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.