in reply to Perl xml parser

I agree with everyone's replies above, especially that you should read up on XML::Simple first.

In my experience, XML::LibXML is very fast, but that only matters if you are parsing a lot of data (say, hundreds of kilobytes). Installing it is a bit trickier than other XML modules: it has more dependencies and is pretty sensitive to their versions. I still think it's worth mentioning because it seems relatively unknown and, well, because it *is* fast :)

Re^2: Perl xml parser
by Aristotle (Chancellor) on Aug 02, 2004 at 16:43 UTC

    XML::LibXML also offers a much simpler and clearer interface than XML::Parser. I'd never use XML::Parser directly; at the least, you need XML::Twig to get any work done.
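
    To give a rough idea of what that interface looks like, here is a minimal sketch. The sample document and its element names are made up for illustration, and it assumes a reasonably recent XML::LibXML that provides load_xml:

    ```perl
    use strict;
    use warnings;
    use XML::LibXML;

    # Hypothetical sample document, invented for this illustration.
    my $xml = '<library>'
            . '<book id="1"><title>Programming Perl</title></book>'
            . '<book id="2"><title>Perl Cookbook</title></book>'
            . '</library>';

    # Parse the string into a DOM tree.
    my $doc = XML::LibXML->load_xml(string => $xml);

    # Pull out data with XPath instead of writing event handlers.
    my @titles    = map { $_->textContent } $doc->findnodes('/library/book/title');
    my $second_id = $doc->findvalue('/library/book[2]/@id');

    print "$_\n" for @titles;
    print "second id: $second_id\n";
    ```

    With XML::Parser you would typically write start/end/character handlers and track state yourself to get at the same values.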

    Another serious reason in favour of libxml, and one that matters more in practice than its performance, is that it consumes much less memory, which is a huge win when you're working with larger XML files. (It's no fun when your Perl processes grow to 600MB to generate a mere 40MB XML file.)

    On unixoid systems, the dependencies are not likely to be a problem either. I don't know if there's a PPM for Windows people, but if so, that should work with a minimum of fuss for them.

    Makeshifts last the longest.

      Hmmm, maybe the performance increases I saw were memory related. For some reason I had not thought to look at that! Moving from XML::Twig to XML::LibXML gave me a 36% speedup in a function that was building a complex structure from raw XML data.

      I've gotten the module to work with Cygwin (perl and libs). I did need to use rebaseall to get my DLLs in order, though; plus there's a tacit dependency on libiconv (tacit == nobody tells you this, but the module fails to build without it).

      On Solaris I encountered two problems. The first was that http://www.sunfreeware.com/ packages a version of libxml2 that is incompatible with the Perl bindings. I solved that (painfully) by making a package of my own. (If anyone wants a copy, give me a holler; but I cannot support it.) The second was a trivial compilation error in one file of the bindings (was it memory management?), plus some pesky configuration minutiae. I sent a patch to the maintainers, but I don't know whether it has been incorporated since.