in reply to Which XML parser would be the wisest to use

I'm not sure what XML::Rules is using under the hood, but how do you know XML::SAX is using XML::SAX::ExpatXS? It might be falling back to XML::SAX::PurePerl, which would be extremely slow. At the low level, the alternative to XML::Parser (which uses expat) is XML::LibXML (which uses libxml2). Everything else is a wrapper around one of those and so bound to be somewhat slower, but possibly easier to use for your specific problem.

Update: XML::Rules uses XML::Parser::Expat. Since it is a wrapper around Expat, it will be slower than XML::Parser::Expat alone, but raw speed is not the reason to choose XML::Rules over XML::Parser::Expat in the first place.

When I ran your XML::SAX code, it used XML::LibXML (via XML::LibXML::SAX) under the hood, but then, I have XML::LibXML installed, and ParserDetails.ini is set to use it.
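
If you want to check this on your own machine, here is a minimal sketch (assuming XML::SAX is installed; the output depends entirely on your ParserDetails.ini). It lists the registered drivers and then prints the class of the object the factory returns, which is the driver actually in use. As far as I know, the factory falls back to the last entry in ParserDetails.ini when nothing else is specified.

```perl
use strict;
use warnings;
use XML::SAX;
use XML::SAX::ParserFactory;

# Drivers registered in ParserDetails.ini, in file order.
print "registered: $_->{Name}\n" for @{ XML::SAX->parsers };

# The class of the object the factory hands back is the driver in use.
my $parser = XML::SAX::ParserFactory->parser;
print "in use: ", ref($parser), "\n";
```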

Re^2: Which XML parser would be the wisest to use
by wardy3 (Scribe) on Feb 21, 2008 at 01:32 UTC

    Thanks, runrig

    I actually set $XML::SAX::ParserPackage in my SAX test and gave each of the modules a go. ExpatXS was the fastest, so I just used it.
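
    For anyone following along, forcing a driver this way looks roughly like the sketch below. It is not my actual test script: ElementCounter and the tiny inline document are made up for illustration, and XML::SAX::PurePerl is used only because it ships with XML::SAX. Swap in XML::SAX::ExpatXS or XML::LibXML::SAX if you have them installed.

    ```perl
    use strict;
    use warnings;
    use XML::SAX::ParserFactory;

    # Hypothetical handler for illustration: counts start tags.
    package ElementCounter;
    use base 'XML::SAX::Base';
    sub start_element {
        my ($self, $element) = @_;
        $self->{count}++;
    }

    package main;

    my $handler = ElementCounter->new;

    # Setting this package variable makes the factory load the named driver
    # instead of consulting ParserDetails.ini.
    local $XML::SAX::ParserPackage = 'XML::SAX::PurePerl';

    my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
    $parser->parse_string('<a><b/><b/></a>');

    print ref($parser), " saw $handler->{count} elements\n";
    ```

    Wrapping a run like this in `time` for each driver against a real file gives the kind of comparison discussed in this thread.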

    I thought XML::LibXML was a tree parser, and I have had little luck with tree parsers: I run out of memory and my Windows session grinds to a halt :-(

    I quite like the feel of SAX and might just put up with the penalty but I wanted to get some opinions before just ploughing ahead and coding all the parsers I need.

    BTW, PurePerl took a lot longer. Here's the original run where I timed a few of the SAX modules :)

    Parser                     real        user        sys
    XML::LibXML::SAX           1m9.658s    1m3.186s    0m0.312s
    XML::SAX::Expat            2m29.873s   2m2.421s    0m0.389s
    XML::SAX::ExpatXS          0m55.370s   0m49.014s   0m0.311s
    XML::LibXML::SAX::Parser   2m31.700s   2m14.342s   0m0.483s
    XML::SAX::PurePerl         5m2.766s    4m23.733s   0m0.515s

      XML::LibXML can be a pure SAX parser (XML::LibXML::SAX) if no DOM functions are used. XML::LibXML::SAX::Parser, on the other hand, is documented as building the DOM first and then generating SAX events from it.

        I'm wary of this note in the XML::LibXML::SAX docs:

        "At the moment XML::LibXML provides only an incomplete interface to libxml2's native SAX implementation. The current implementation is not tested in production environment. It may causes significant memory problems or shows wrong behaviour. If you run into specific problems using this part of XML::LibXML, let me know."

        I might have based XML::Rules on this module were it not for that warning. Until the module's authors remove it, I don't think I'll switch, though if anyone wants to attempt rewriting XML::Rules on top of SAX, I don't mind and would do my best to help.