in reply to XML::LibXSLT & --html flag?
Tinkster? Seriously? My name is Toby Inkster.
Anyway, the difference in times may be due to DTDs. By default libxml (which libxslt is built on) downloads DTDs and uses them to expand entities (e.g. convert &eacute; → é). This network activity slows parsing down significantly.
LibXML can thankfully be pointed at a local catalogue of DTDs. (See XML::LibXML::Parser and the load_catalog method.) This speeds it up significantly.
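Something along these lines should do it. This is only a rough sketch: the catalogue path is an assumption (many Linux systems keep one at /etc/xml/catalog, yours may differ), and the sample document declares its entity inline so that the snippet runs without any network access or catalogue at all.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumed location of the system XML catalogue; adjust for your machine.
my $catalog = '/etc/xml/catalog';

my $parser = XML::LibXML->new(
    load_ext_dtd    => 1,   # still read DTDs so entities are defined
    expand_entities => 1,   # replace entity references with their values
    no_network      => 1,   # never fetch a DTD over the network
);

# Resolve DTD public/system identifiers to local files where possible.
$parser->load_catalog($catalog) if -r $catalog;

# Sample input with an internal DTD subset, so nothing external is needed.
my $xml = <<'XML';
<!DOCTYPE p [ <!ENTITY eacute "&#233;"> ]>
<p>caf&eacute;</p>
XML

my $doc = $parser->load_xml( string => $xml );

binmode STDOUT, ':encoding(UTF-8)';
print $doc->documentElement->textContent, "\n";
```

With no_network set, a DTD that is missing from the catalogue makes the parse complain instead of quietly going off to fetch it, which at least makes the slow case obvious.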
Also check out my module HTML::HTML5::Parser which (IMHO) parses HTML much better than libxml's built-in HTML parser.
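A minimal sketch of using it, assuming the module is installed from CPAN (the markup here is just a made-up sample). The result is an ordinary XML::LibXML document, so it can be handed on to XML::LibXSLT like any other.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;

# Made-up tag soup: unclosed <p> elements and no <html>/<body> wrapper.
my $html = '<title>Test</title><p>First paragraph<p>Second paragraph';

my $parser = HTML::HTML5::Parser->new;

# parse_string returns an XML::LibXML::Document.
my $doc = $parser->parse_string($html);

print $doc->toString, "\n";
```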
Re^2: XML::LibXSLT & --html flag?
by Tinkster (Novice) on May 15, 2012 at 17:32 UTC