A SAX parser reports events in document order, so parsing this input looks like this:
<p> <i> Italics <b> plus bold </b> </i> </p>

start_document
start_element : p
start_element : i
characters : Italics
start_element : b
characters : plus bold
end_element : b
end_element : i
end_element : p
end_document
The HTML::StripScripts (HSS) stream looks like this:
start_document
start_element : b
content : plus Bold
end_element : b
start_element : i
content : Italics
content : <b>plus Bold</b>
end_element : i
start_element : p
content : <i>Italics <b>plus Bold</b></i>
end_element : p
end_document
The reason for that is my tag callbacks, which let you change a tag and its contents, or delete the tag, its contents, or both. To do that, a callback needs access to the element's contents (its child nodes), so an element can only be reported once everything inside it has been processed.
So the tree is built from the leaves up (innermost elements first), rather than from the root down in the traditional manner.
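To make the ordering concrete, here is a minimal sketch in Python (chosen only for illustration; it is not how HTML::StripScripts is implemented). It assumes a hypothetical handler, `LeafFirstTrace`, that sits on top of an ordinary SAX parser and holds back each element until its end tag arrives, which is the point at which a callback could see the full contents. Attributes are ignored and text is not re-escaped, to keep the example short.

```python
import xml.sax


class LeafFirstTrace(xml.sax.ContentHandler):
    """Hypothetical handler: report each element only once it is closed."""

    def __init__(self):
        super().__init__()
        self.events = []
        self.stack = []  # one (name, parts) frame per currently open element

    def startDocument(self):
        self.events.append("start_document")

    def endDocument(self):
        self.events.append("end_document")

    def startElement(self, name, attrs):
        # Emit nothing yet; just remember that this element is open.
        self.stack.append((name, []))

    def characters(self, content):
        if not self.stack:
            return
        parts = self.stack[-1][1]
        # Merge adjacent text chunks so parser chunking does not matter.
        if parts and parts[-1][0] == "text":
            parts[-1] = ("text", parts[-1][1] + content)
        else:
            parts.append(("text", content))

    def endElement(self, name):
        name, parts = self.stack.pop()
        # The element is complete, so its whole contents are now known.
        self.events.append(f"start_element : {name}")
        for _kind, value in parts:
            if value.strip():
                self.events.append(f"content : {value.strip()}")
        self.events.append(f"end_element : {name}")
        # Hand the serialized element to the parent as one piece of content.
        serialized = f"<{name}>" + "".join(v for _k, v in parts) + f"</{name}>"
        if self.stack:
            self.stack[-1][1].append(("markup", serialized))


def leaf_first_events(markup):
    """Parse markup with the stdlib SAX parser and return leaf-first events."""
    handler = LeafFirstTrace()
    xml.sax.parseString(markup.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    for event in leaf_first_events("<p><i>Italics <b>plus bold</b></i></p>"):
        print(event)
```

Run on the sample input, this reports b first, then i, then p, with each parent receiving its already-serialized child as a single piece of content. That is the same shape as the HSS stream above, and it shows why a consumer expecting SAX order cannot be fed these events directly.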
I've implemented a solution, which I have posted here: RFC: HTML::StripScripts::LibXML
Clint
In reply to Re^2: Returning an XML::LibXML::DocumentFragment from HTML::StripScripts by clinton, in thread Returning an XML::LibXML::DocumentFragment from HTML::StripScripts by clinton