You can also have a look at the Module Reviews for XML modules and Ways to Rome, an article that solves the same problem using various XML modules.
The problem is that there is a lot of overlap between the various modules. Some cannot be used in certain circumstances, but for any particular problem there are at least two modules that will work. Basically it boils down to how much you like each module's interface.
A quick overview would be:
- XML::Parser: the basic one, most of the other modules are built on top of it, fast, low-level (can be a pain to use),
- XML::Simple: quite simple, robust, widely-used, tree-based (hence can be slow on big files and cannot deal with huge ones), does not work for document-oriented XML,
- XML::DOM: ugly, tree-oriented, widely used, not actively maintained at the moment, follows a W3C standard, can be a pain to install (BTW, if you are interested in the DOM I have started writing a little helper module for it, named... XML::DOM::Twig),
- XML::PYX: line-oriented, fast, not convenient for complex transformations,
- XML::XPath: powerful, getting faster and faster, very well supported (by Matt Sergeant, the most prolific XML developer around),
- XML::Twig: Perlish, DWIMy, can deal with huge documents, you know what I think of it ;--)
There are others too: XML::RAX for record-oriented XML, XML::DT, XML::SimpleObject...
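To make the "can deal with huge documents" point a bit more concrete, here is a minimal sketch of the handler style XML::Twig offers. The sample data and the element names (catalog, book, title) are made up for illustration, and on a really big document you would call parsefile on a file instead of parsing a string:

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample data, standing in for a much larger file.
my $xml = <<'XML';
<catalog>
  <book id="b1"><title>First</title></book>
  <book id="b2"><title>Second</title></book>
</catalog>
XML

my @titles;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <book> element has been parsed.
        book => sub {
            my ($t, $book) = @_;
            push @titles,
                $book->att('id') . ': ' . $book->first_child_text('title');
            $t->purge;    # release what has been parsed so far
        },
    },
);
$twig->parse($xml);

print "$_\n" for @titles;
```

Only one book is held in memory at a time, which is why this style copes with files that a purely tree-based module such as XML::Simple would have to load in full.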
In any case I think we're heading towards big changes in the XML module landscape. XML::Parser is not a SAX-based parser (it actually predates SAX), which is a pain, and it is also awkward to install because it is based on expat, an external library. I think we will see new modules based either on a pure-Perl SAX parser (there is one in SOAP::Lite) or on libxml, the Gnome XML library, plus existing modules being ported to work with those two kinds of SAX parsers.
So I guess it will always be very difficult to give a "decision-tree" to choose a module, and in any case it is too early...