Re: Re: xml parsers: do I need one?

by mirod (Canon)
on Aug 29, 2003 at 19:36 UTC

in reply to Re: xml parsers: do I need one?
in thread xml parsers: do I need one?

Nice benchmark. I wouldn't use XML::Parser's Stream style, but that's probably because I am not very familiar with it.

I expanded this benchmark slightly, creating a somewhat more complicated document, still around 3M and 10K elements, and ran a bunch of modules on it. The results are actually quite surprising:

10160 elements generated - (63 top level - 1721 to extract)
bench_regexp             : 0:00.16 real 0.14  0.03 s
bench_libxml             : 0:00.44 real 0.39  0.05 s
bench_parser             : 0:00.88 real 0.83  0.01 s
bench_parser_stream      : 0:01.15 real 1.10  0.06 s
bench_twig               : 0:01.84 real 1.81  0.03 s
bench_sax_base_libxml    : 0:03.29 real 3.25  0.05 s
bench_sax_libxml         : 0:03.32 real 3.31  0.03 s
bench_sax_expat          : 0:03.21 real 3.11  0.03 s
bench_dom                : 0:04.51 real 4.41  0.03 s
libxslt                  : 0:01.48 real 1.46  0.02 s
xml_grep                 : 0:02.07 real 2.02  0.03 s

I am very surprised by how slow the XML::SAX examples are (hence I wrote one using XML::SAX::Base and one not using it). I did not expect this, and I will try to figure out what the problem is. If you look at the code, I really don't think I am using the PurePerl parser: I took great care to create the parser myself. That's odd.
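For readers who want to rule out the PurePerl fallback on their own machine, here is a minimal sketch of one way to pin the SAX driver and print which class actually got instantiated. It is not taken from the benchmark code: CountHandler is a hypothetical handler made up for illustration, and the sketch assumes XML::SAX and XML::LibXML are both installed.

```perl
use strict;
use warnings;
use XML::SAX::ParserFactory;

{
    # Hypothetical minimal handler (not from the benchmark): it only
    # counts start tags, which is enough to see the parser at work.
    package CountHandler;
    sub new           { my $class = shift; return bless { count => 0 }, $class }
    sub start_element { my $self  = shift; $self->{count}++ }
}

# Ask the factory for a specific driver instead of letting it choose.
# Assumption: XML::LibXML (which provides XML::LibXML::SAX) is installed.
$XML::SAX::ParserPackage = 'XML::LibXML::SAX';

my $handler = CountHandler->new;
my $parser  = XML::SAX::ParserFactory->parser(Handler => $handler);

# The class name shows which driver is really in use.
print "SAX driver in use: ", ref($parser), "\n";

$parser->parse_string('<doc><item/><item/></doc>');
print "start tags seen: $handler->{count}\n";
```

If the first line of output names XML::SAX::PurePerl rather than the driver you asked for, the slowdown has an obvious explanation; if it names the libxml or expat driver, the overhead is somewhere else.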

Code and everything to run it is at

Get simple_benchmark.tar.gz

tar zxvf simple_benchmark.tar.gz
cd simple_benchmark
perl run_all

Note that the xml_grep version only works with the latest, greatest release of the tool, available somewhere else on the same site (with the development version of XML::Twig).

Re: Re: Re: xml parsers: do I need one?
by samtregar (Abbot) on Aug 29, 2003 at 21:27 UTC
    Wow, and I thought I was going overboard! Very interesting. I'd like to see how Xerces/C++ fares too, but I still can't build XML::Xerces. I've been using the DOMCount example program to do XML Schema validation...


