
While on the subject of XML, don't forget Enno Derksen's XML::DOM. The Document Object Model (DOM) is a standard API for working with XML, and it varies little from Perl to Java to C++.

XML::Twig, written by our own mirod, is a simpler and more efficient interface to XML::Parser. It is better than XML::DOM for many uses. However, an advantage of learning XML::DOM is that knowledge of the DOM transfers to other languages that also implement it.

Re: Re: Re: Sanctifying Modules
by mirod (Canon) on Feb 12, 2001 at 16:34 UTC

    I cannot really recommend XML::DOM at this time.

    It might be standard, robust and widely used, but it is no longer supported (Enno seems to have disappeared from the surface of the Earth). It is also difficult to install: the latest version is hidden in a bundle (libxml-enno) and has not been updated to be compatible with recent versions of XML::Parser.

    This could improve in the near future, as it looks like someone has volunteered to take over its maintenance (although breaking compatibility with older versions of XML::Parser in the process).

    Beyond the hopefully temporary lack of support, XML::DOM is not only ugly, I also think it is dangerous. The level of the API is so low that using the DOM directly results in code that can be broken really easily.

    A good example is that you cannot simply go from an element node to the next one. If there is a comment before the element you want, the getNextSibling method will happily return the comment to you. Of course it is easy to write a getNextElement method, and even to add an extra parameter so you can specify the tag of the element you want, but then you end up using a non-standard API and, from the examples I see floating around the web, I don't think too many people go to that trouble. I am fairly sure that most DOM code that does not use an extra (non-standard) layer of methods on top of the standard ones can be broken just by inserting (legal) XML comments in the documents.
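
    To make the comment problem concrete, here is a minimal sketch. It uses Python's standard xml.dom.minidom rather than XML::DOM, on the assumption that the W3C DOM calls behave the same way in both (minidom spells getNextSibling as the nextSibling attribute). The sample document and the next_element helper are made up for illustration; the helper is exactly the kind of non-standard layer described above.

```python
from xml.dom import minidom, Node

# Hypothetical sample document: a legal comment sits between two elements.
doc = minidom.parseString(
    "<doc><title>T</title><!-- a legal comment --><para>P</para></doc>"
)
title = doc.documentElement.firstChild

# The plain DOM call returns whatever node comes next, here the comment.
naive = title.nextSibling


def next_element(node, tag=None):
    """Hypothetical helper: next sibling that is an element,
    optionally restricted to a given tag name."""
    sib = node.nextSibling
    while sib is not None:
        if sib.nodeType == Node.ELEMENT_NODE and (tag is None or sib.tagName == tag):
            return sib
        sib = sib.nextSibling
    return None


print(naive.nodeType == Node.COMMENT_NODE)  # the comment, not <para>
print(next_element(title).tagName)          # skips the comment
```

    Code that assumes title.nextSibling is the para element works until somebody adds the comment, which is the fragility in question.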

    The fact that the only practical way to access elements is getElementsByTagName also means that it is used everywhere in a DOM program, despite being really inefficient.
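
    One likely source of that cost is that getElementsByTagName searches the whole subtree below the node it is called on, even when only the immediate children are wanted. The sketch below shows this with Python's xml.dom.minidom; I am assuming XML::DOM's default behaviour is the same. The sample document is hypothetical, and the sketch shows only how far the search reaches, not timings.

```python
from xml.dom import minidom, Node

# Hypothetical sample document with one nested <item>.
doc = minidom.parseString(
    "<list><item>a<item>nested</item></item><item>b</item></list>"
)
root = doc.documentElement

# getElementsByTagName visits every descendant of root, not just its children.
all_items = root.getElementsByTagName("item")

# Walking childNodes directly only touches the immediate children.
child_items = [
    n for n in root.childNodes
    if n.nodeType == Node.ELEMENT_NODE and n.tagName == "item"
]

print(len(all_items), len(child_items))
```

    On a large document, calling it at every level repeats that full walk each time, and it also returns nested matches you may not have wanted.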

    I have not used XML::XPath but I would think that it is better designed than the DOM, better supported than XML::DOM, and more efficient as it now uses a C library for managing nodes.