Nice benchmark. I wouldn't use XML::Parser's Stream style myself, but that's probably because I am not very familiar with it.

I expanded this benchmark slightly, creating a somewhat more complicated document (still around 3M and 10K elements), and ran a bunch of modules on it. The results are actually quite surprising:

10160 elements generated - (63 top level - 1721 to extract)
bench_regexp             : 0:00.16 real 0.14  0.03 s
bench_libxml             : 0:00.44 real 0.39  0.05 s
bench_parser             : 0:00.88 real 0.83  0.01 s
bench_parser_stream      : 0:01.15 real 1.10  0.06 s
bench_twig               : 0:01.84 real 1.81  0.03 s
bench_sax_base_libxml    : 0:03.29 real 3.25  0.05 s
bench_sax_libxml         : 0:03.32 real 3.31  0.03 s
bench_sax_expat          : 0:03.21 real 3.11  0.03 s
bench_dom                : 0:04.51 real 4.41  0.03 s
libxslt                  : 0:01.48 real 1.46  0.02 s
xml_grep                 : 0:02.07 real 2.02  0.03 s

I am very surprised by how slow the XML::SAX examples are (which is why I wrote two of them, one using XML::SAX::Base and one not using it). I did not expect this, and I will try to figure out what the problem is. If you look at the code, I really don't think I am using the PurePerl parser: I took great care to create the parser myself. That's odd.

Code and everything to run it is at

Get simple_benchmark.tar.gz

tar zxvf simple_benchmark.tar.gz
cd simple_benchmark
perl run_all

Note that the xml_grep version only works with the latest, greatest release of the tool, available somewhere else on the same site (with the development version of XML::Twig).

In reply to Re: Re: xml parsers: do I need one? by mirod
in thread xml parsers: do I need one? by regan
