in reply to XML Module Recommendations
In terms of general resources you can have a look at the Perl and XML FAQ and Kip Hampton's XML.com column. The Module Reviews on this site also include quite a few nodes about XML modules.
More specifically, I don't think there is any module that will satisfy all your criteria, but let's list the main candidates amongst the tree-based modules:
- XML::Parser: available everywhere (it comes standard with ActiveState Perl, as it is used by PPM, but you need to install expat separately on *nix), low-level (usually used to build more convenient modules), has a Tree style that gives access to the whole document at once, but no one seems to like it (or even use it),
- XML::Simple: available through PPM, based on XML::Parser, can be used only for data-oriented XML (no mixed content), loads the XML into a Perl structure,
- XML::Twig: based on XML::Parser, no PPM available (see the FAQ for instructions on installing it on Windows), mixed event-tree mode, I like it (but I also wrote it ;--),
- XML::DOM: based on XML::Parser, my only take on it is that the DOM is NOT appropriate for general purpose XML transformation, it gives you plenty of rope... avoid it,
- XML::LibXML: based on libxml2, which needs to be installed, but a really nice module, which gives you SAX, DOM and XPath (the addition of XPath makes the DOM usable).
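To give an idea of why XPath makes the DOM usable, here is a minimal sketch with XML::LibXML. The catalog document and its element names are made up for illustration, and I am assuming a version of XML::LibXML recent enough to have load_xml:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample document, only here to have something to query.
my $catalog_xml = <<'XML';
<catalog>
  <book id="b1"><title>Perl and XML</title></book>
  <book id="b2"><title>Learning Perl</title></book>
</catalog>
XML

# Build the whole tree in memory (DOM style).
my $doc = XML::LibXML->load_xml(string => $catalog_xml);

# XPath lets you reach the nodes you want without walking the tree by hand.
my @all_titles = map { $_->textContent } $doc->findnodes('/catalog/book/title');
my ($b2_title) = $doc->findnodes('/catalog/book[@id="b2"]/title');

print "all titles: @all_titles\n";
print "title of b2: ", $b2_title->textContent, "\n";
```

Doing the same with plain DOM calls means looping over child nodes and testing each one, which is where the rope comes in.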
Those are the main tree-based modules; all of the SAX modules are event-based. BTW XML::SAX::PurePerl would probably be too slow for a 10M file, so it is likely that you will not be able to use a pure-Perl solution.
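For a file that size, the mixed event-tree mode of XML::Twig is one way to avoid loading everything at once. A minimal sketch, assuming a made-up document where each record element is small enough to handle as a tree (the element names are hypothetical):

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical sample; a real 10M file would be read with parsefile instead.
my $records_xml = <<'XML';
<records>
  <record><name>alpha</name><value>1</value></record>
  <record><name>beta</name><value>2</value></record>
  <record><name>gamma</name><value>3</value></record>
</records>
XML

my @names;
my $total = 0;

my $twig = XML::Twig->new(
    twig_handlers => {
        # Called each time a complete <record> has been parsed.
        record => sub {
            my ($t, $record) = @_;
            push @names, $record->first_child_text('name');
            $total += $record->first_child_text('value');
            $t->purge;    # release the memory used by what has been processed
        },
    },
);
$twig->parse($records_xml);

print "names: @names\n";
print "total: $total\n";
```

Inside the handler you work on a small tree; between handlers the parse is event-driven, so memory use stays roughly proportional to one record rather than to the whole file.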
In the end I would think that XML::Twig (surprise ;--) or XML::LibXML are the best choices, unless you can use XML::Simple. It also depends on the kind of XML you are dealing with (data or document).
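If your XML is on the data side, a sketch of what XML::Simple gives you might look like this. The config document is invented for the example, and the ForceArray and KeyAttr settings are just the ones I would pick here to get a predictable structure:

```perl
use strict;
use warnings;
use XML::Simple;

# Hypothetical data-oriented document: no mixed content anywhere.
my $config_xml = <<'XML';
<config>
  <server name="web1"><port>8080</port></server>
  <server name="web2"><port>8081</port></server>
</config>
XML

# ForceArray keeps <server> an array even if there is only one;
# KeyAttr => [] turns off folding of the list into a hash keyed on name.
my $config = XMLin($config_xml, ForceArray => ['server'], KeyAttr => []);

for my $server (@{ $config->{server} }) {
    printf "%s listens on port %s\n", $server->{name}, $server->{port};
}
```

As soon as you have text mixed with elements, this kind of structure stops being usable, and you are back to XML::Twig or XML::LibXML.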