Xenofur has asked for the wisdom of the Perl Monks concerning the following question:

This was answered quickly. XML::Simple + XML::Parser is the answer.

-----

I'm currently trying to parse XML files like this one http://api.eve-central.com/api/quicklook?typeid=24312 in such a manner that in the end I have two arrays of hashes, one for sell orders and one for buy orders, so I can dump said orders into a database.

For various reasons I'd like to stick to modules that can be installed via ActivePerl in a simple manner, but also don't take ages to do the processing.

So far I have tried: Do you have any other suggestions?

Re: Fastest XML parser that can run under ActivePerl on Windows?
by ikegami (Patriarch) on Apr 26, 2009 at 22:42 UTC

    XML::Simple doesn't do any parsing. It uses one of many parsers. Many users of XML::Simple use XML::SAX::PurePerl as their parser, the slowest of the bunch. (And it has problems with encodings.) According to my tests (which I don't have handy), XML::Simple is fastest when you have it use XML::Parser. XML::LibXML is much faster yet.

      Wow, thanks a lot. That works perfectly. :)

      Edit: XML::Parser, I mean. With LibXML it only complains about the ParserDetails.ini missing and still runs slowly.
        I didn't know XML::LibXML could be used with XML::Simple. I didn't benchmark that.
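For readers following along, here is a minimal sketch of the XML::Simple + XML::Parser combination discussed above. It sets `$XML::Simple::PREFERRED_PARSER` so that XML::Simple does not fall back to a SAX parser. The sample document and its element names (`quicklook`, `sell_orders`, `buy_orders`, `order`, `price`, `vol_remain`) are assumptions about what the API returns, and `orders_from_xml` is a hypothetical helper name, not something from the thread.

```perl
use strict;
use warnings;
use XML::Simple qw(XMLin);

# Ask XML::Simple to use XML::Parser instead of a SAX parser.
$XML::Simple::PREFERRED_PARSER = 'XML::Parser';

# Hypothetical sample; the real feed may use different element names.
my $sample_xml = <<'XML';
<evec_api version="2.0" method="quicklook">
  <quicklook>
    <item>24312</item>
    <sell_orders>
      <order id="1001"><price>12.50</price><vol_remain>300</vol_remain></order>
      <order id="1002"><price>13.00</price><vol_remain>50</vol_remain></order>
    </sell_orders>
    <buy_orders>
      <order id="2001"><price>10.00</price><vol_remain>1000</vol_remain></order>
    </buy_orders>
  </quicklook>
</evec_api>
XML

# Returns two array refs of hashes: sell orders and buy orders.
sub orders_from_xml {
    my ($xml) = @_;
    # ForceArray keeps <order> an array even when there is only one;
    # KeyAttr => [] stops XML::Simple from folding orders into a hash by id.
    my $data = XMLin($xml, ForceArray => ['order'], KeyAttr => []);
    my $ql   = $data->{quicklook};
    my @sell = @{ $ql->{sell_orders}{order} || [] };
    my @buy  = @{ $ql->{buy_orders}{order}  || [] };
    return (\@sell, \@buy);
}

my ($sell, $buy) = orders_from_xml($sample_xml);
printf "%d sell orders, %d buy orders\n", scalar @$sell, scalar @$buy;
```

Each element of the two arrays is a plain hash (attributes and child elements side by side), which should map fairly directly onto a database row.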
Re: Fastest XML parser that can run under ActivePerl on Windows?
by almut (Canon) on Apr 26, 2009 at 23:05 UTC
    XML::Bare : doesn't install

    Just wondering why that is...  The core of the module is a rather straightforward piece of C code, which doesn't appear to be using any fancy constructs available only with one specific compiler.  And the Makefile.PL does have entries for Win32/msvc.  IOW, I guess it should work...

      FWIW, the error is this:
      CPAN.pm: Going to build C/CO/CODECHILD/XML-Bare-0.43.tar.gz
      'x' outside of string in unpack at Makefile.PL line 28.
      (C:\Perl\bin\perl.exe Makefile.PL INSTALLDIRS=site exited with 512)
      CPAN::Reporter: Makefile.PL result is 'unknown', Stopped with an error.
      I didn't try looking into how to fix this, as any local fix wouldn't be useful to users of my script elsewhere.

        I can replicate the problem with Perl 5.10.0 (it's fine with 5.8.8).  The respective snippet is (with the debug line added):

        my $ver = $]*1000; # correct for possibile division problems
        print "\$]=$], \$ver=$ver\n"; # debug
        my ($major,$minor,$sub) = unpack("AA3xA3","$ver");

        Output 5.8.8:

        $]=5.008008, $ver=5008.008

        Output 5.10.0:

        $]=5.010000, $ver=5010
        'x' outside of string in unpack at ./760220.pl line 10.

        (note the missing .XXX part in $ver, which makes unpack() complain)

        Apparently, this has never been tried with a .0 release...   (Update: bug reported)
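To illustrate the failure described above and one possible way around it, here is a small sketch. It is not the fix that went into XML::Bare; `split_perl_version` is a hypothetical helper that formats the version with a fixed number of decimals, so the trailing zeros of a .0 release survive stringification.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::Bare: split a $]-style version
# number (e.g. 5.010000) into major, minor and subversion.
sub split_perl_version {
    my ($v) = @_;
    # Exactly six decimals, so we get "5.010000" rather than "5.01".
    my $str = sprintf '%.6f', $v;
    my ($major, $minor, $sub) = $str =~ /^(\d+)\.(\d{3})(\d{3})$/
        or die "unexpected version format: $str";
    return ($major + 0, $minor + 0, $sub + 0);
}

# The original approach loses the fractional part on a .0 release,
# so the 'x' in the unpack template runs off the end of the string.
my $ver = 5.010000 * 1000;    # stringifies without a ".000" part
my $ok  = eval { my @parts = unpack('AA3xA3', "$ver"); 1 };
print "unpack failed: $@" unless $ok;

# Works for the running perl, whatever its version.
printf "%d.%d.%d\n", split_perl_version($]);
```

Unlike the unpack version, this returns plain numbers (8 rather than "008"), which may or may not matter to the code that consumes them.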

Re: Fastest XML parser that can run under ActivePerl on Windows?
by ruzam (Curate) on Apr 27, 2009 at 02:21 UTC
    If your XML limitations are known and you're after a pure Perl parser, I've found XML::Tiny useful. I've also written a tiny parser of my own, very similar to XML::Tiny. Mine is faster and returns results in a format similar to XML::Simple (less nesting of data, which I think accounts for the speedup over XML::Tiny).
      My concern isn't about pure Perl, but being able to install it via ppm out of the box. :)

      XML::Tiny isn't bad, but XML::Simple + XML::Parser is still 50% faster.
        My concern isn't about pure Perl, but being able to install it via ppm out of the box.

        The ActiveState repository isn't the only place to get ppm modules. See this page: http://ppm4.activestate.com/

        If you are using the GUI interface to PPM, go into Edit | Preferences and there is a way to add other repositories. Sometimes you will find something that isn't in the default ActiveState repository (I've gotten crypt modules from U of Winnipeg before). I don't know for sure whether this will help you on your XML quest or not, but thought it was worth a mention.

Re: Fastest XML parser that can run under ActivePerl on Windows?
by ambrus (Abbot) on Apr 28, 2009 at 14:25 UTC

    Try ruby's rexml module. After that, you'll find LibXML or Twig blazing fast.

Re: Fastest XML parser that can run under ActivePerl on Windows?
by Jenda (Abbot) on May 01, 2009 at 08:56 UTC

    Try XML::Rules. With that you can either filter the data to contain only the stuff you need and end up with those arrays or insert the individual orders into the database as they are parsed and save memory. Uses XML::Parser and installs with PPM.
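A rough sketch of that approach, for illustration only: each `<order>` is handled as soon as its closing tag is seen, and the parent tag name decides which list it goes into. The element names and the sample document are assumptions about the feed, and `orders_via_rules` is a hypothetical helper name. To save memory, the `push` could be replaced by a database insert.

```perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical sample; the real feed may use different element names.
my $sample_xml = <<'XML';
<evec_api version="2.0" method="quicklook">
  <quicklook>
    <item>24312</item>
    <sell_orders>
      <order id="1001"><price>12.50</price><vol_remain>300</vol_remain></order>
      <order id="1002"><price>13.00</price><vol_remain>50</vol_remain></order>
    </sell_orders>
    <buy_orders>
      <order id="2001"><price>10.00</price><vol_remain>1000</vol_remain></order>
    </buy_orders>
  </quicklook>
</evec_api>
XML

# Returns two array refs of hashes: sell orders and buy orders.
sub orders_via_rules {
    my ($xml) = @_;
    my (@sell, @buy);
    my $parser = XML::Rules->new(
        stripspaces => 7,    # drop whitespace-only text between tags
        rules       => {
            # Child tags such as <price> become key => text in the parent.
            _default => 'content',
            order    => sub {
                my ($tag, $attrs, $context) = @_;
                # $context holds the names of the enclosing tags;
                # the last one is the direct parent.
                my $list = $context->[-1] eq 'sell_orders' ? \@sell : \@buy;
                push @$list, { %$attrs };    # or insert into the database here
                return;                      # keep nothing in the result tree
            },
        },
    );
    $parser->parse($xml);
    return (\@sell, \@buy);
}

my ($sell, $buy) = orders_via_rules($sample_xml);
printf "%d sell orders, %d buy orders\n", scalar @$sell, scalar @$buy;
```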

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.