Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Does anyone here have much experience using XML::SAX::PurePerl? I'm looking for a pure-Perl XML parser (no C requirements whatsoever). The docs warn that it is very slow and buggy. Is it still usable? Are we talking Redhat buggy, or Microsoft buggy? Are there any better alternatives out there?

I also noticed that a search on CPAN turns up two versions of XML::SAX::PurePerl: 0.12, released in Nov 2002, and 0.80, released in Nov 2001. Which one should I be using?

My reason for a pure-perl xml parser is to avoid any module installs and provide a single package for the job. Should I be looking at other options? Thank you for your input.

Replies are listed 'Best First'.
Re: XML::SAX::PurePerl experience
by grantm (Parson) on Jun 21, 2003 at 19:26 UTC

    Trying to avoid installing extra modules will most likely just cause you pain downstream. For example, I wouldn't recommend using the SAX PurePerl parser if your target Perl version is earlier than 5.8 since you won't have any encoding support.

    Possibly the easiest 'real' SAX parser to get going is XML::SAX::Expat. It uses XML::Parser, which comes standard with ActivePerl on Win32, Linux and Solaris (and possibly HP-UX?). For non-ActivePerl systems, most Linux distributions come with the expat library, and it's easy to build on other Unixes, so installing XML::Parser is usually no problem (as long as a compiler is installed).
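Since SAX handlers are written against a common interface, the same handler code should run whether the factory hands back XML::SAX::Expat or XML::SAX::PurePerl. Here is a minimal sketch, assuming XML::SAX is installed; the ElementCounter package and the sample document are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical handler: counts how often each element name occurs.
package ElementCounter;
use base 'XML::SAX::Base';

sub start_element {
    my ($self, $el) = @_;
    $self->{count}{ $el->{Name} }++;      # $el->{Name} is the element name
    $self->SUPER::start_element($el);     # keep the SAX event chain intact
}

package main;
use XML::SAX::ParserFactory;

my $handler = ElementCounter->new;

# The factory returns whichever SAX parser is registered on this system.
my $parser = XML::SAX::ParserFactory->parser(Handler => $handler);
$parser->parse_string('<list><item/><item/></list>');

for my $name (sort keys %{ $handler->{count} }) {
    print "$name: $handler->{count}{$name}\n";
}
```

Nothing in the handler names a specific parser, so moving from the pure-Perl parser to an expat-based one later should not require touching this code.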

Re: XML::SAX::PurePerl experience
by runrig (Abbot) on Jun 21, 2003 at 17:03 UTC

    Is it still usable? Are we talking Redhat buggy, or Microsoft buggy?
    It's usable, depending on your needs, but as the docs say, it is very weak in some areas. The *PurePerl parser is really only for when there is absolutely no other alternative. My advice is to take the advice of the XML::SAX::PurePerl docs and install something else, like a parser based on libxml.
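If you want to ship one package but still use a faster parser wherever it happens to be installed, one option is to probe for modules at run time and keep PurePerl as the last resort. This is only a sketch: the first_available helper is a made-up name, and the candidate list is an assumption about which parsers you might want to prefer.

```perl
use strict;
use warnings;

# Candidate SAX parsers, preferred first; PurePerl is the fallback.
my @candidates = qw(XML::LibXML::SAX XML::SAX::Expat XML::SAX::PurePerl);

# Hypothetical helper: return the first module that loads, or undef.
sub first_available {
    for my $module (@_) {
        (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
        return $module if eval { require $file; 1 };
    }
    return;
}

my $chosen = first_available(@candidates);
print defined $chosen
    ? "Using $chosen\n"
    : "No SAX parser is installed\n";
```

The probe uses only core Perl, so it runs even on a box where none of the XML modules are present.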

Re: XML::SAX::PurePerl experience
by Matts (Deacon) on Jun 22, 2003 at 10:42 UTC

    The only known bugs are in the area of processing DTD fragments and entities. I haven't fixed those yet because nobody has asked me to!

Re: XML::SAX::PurePerl experience
by graff (Chancellor) on Jun 23, 2003 at 03:47 UTC

    I guess one "downside" to the lib-based XML modules is that they seem to depend on something that you don't get from CPAN: James Clark's expat toolkit.

    Of course, that's an easy thing to get and easy to install -- it took me less than 5 minutes to google it, download it, compile it (just "make"), and move it to /usr/local/expat -- and it's presumably easier on any Windows box, since there are precompiled libs in the package. So having XML::Parser depend on expat is not a problem, I think. (In fact, I'd rather have expat as the guts for an XML parser in Perl -- James Clark is God in this domain.)

    For that matter, if people are going to be doing stuff with XML in general anyway, there's no good reason to avoid having "expat" available, just on general principles, whether you use Perl or other things on XML data.

      This reminds me - I'd like to be able to use entity references and inline DTDs but expat supports neither. Got any suggestions?

        Uh? Expat supports both entities and DTDs. Could you describe a little more what you think is missing?

        The only problem I am aware of is with (non-character) entities in attribute values (they disappear).
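For what it's worth, here is a small sketch of an entity declared in an inline DTD and referenced in element content, parsed with XML::Parser (which sits on top of expat). It assumes XML::Parser is installed; the document and the entity name are invented for the example.

```perl
use strict;
use warnings;
use XML::Parser;

# Sample document with an internal DTD subset declaring one entity.
my $xml = <<'XML';
<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY who "Perl Monks">
]>
<note>Hello, &who;!</note>
XML

my $text = '';

# Collect character data; the Char handler may fire several times.
my $parser = XML::Parser->new(
    Handlers => {
        Char => sub {
            my ($expat, $string) = @_;
            $text .= $string;
        },
    },
);

$parser->parse($xml);
print "$text\n";
```

With no Default handler set and NoExpand left off, the reference in element content should arrive already expanded in the character data.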