Re^2: can't extract node with HTML::TreeBuilder::XPath

Re^3: can't extract node with HTML::TreeBuilder::XPath by saunderson (Novice) on Jul 30, 2012 at 11:27 UTC
"What does HTML::TreeBuilder do? Who knows!? I KNOW! It tells you to read the source, how awful :)"

I second that. A spec-compatible HTML::TreeBuilder::XPath that works with the XPaths extracted by a common browser would definitely be a simplification.
Re^4: can't extract node with HTML::TreeBuilder::XPath by Anonymous Monk on Aug 01, 2012 at 03:34 UTC
"I second that. A spec-compatible HTML::TreeBuilder::XPath that works with the XPaths extracted by a common browser would definitely be a simplification."

I was being sarcastic :) HTML::HTML5::Parser isn't documented much better than HTML::TreeBuilder; you have to read the source just the same.

FYI, HTML::TreeBuilder::XPath just tacks an XPath 1.0 engine onto a TreeBuilder tree. Common browser add-ons commonly modify the DOM, and it's usually only the @class and @id attributes you're interested in, not absolute paths.

htmltreexpather.pl works with the actual tree that HTML::TreeBuilder builds, no browser required :)
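To illustrate the point about @class and @id versus absolute paths, here is a minimal sketch. The HTML snippet, the id "content" and the class "price" are made up for the example, and it assumes that HTML::TreeBuilder does not add a tbody element the way a browser's DOM does, which is why a path copied from a browser may come back empty.

```perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Hypothetical markup, invented for this example.
my $html = <<'HTML';
<html><body>
<div id="content">
<table><tr><td class="price">42</td></tr></table>
</div>
</body></html>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Absolute path as a browser tool might report it (browsers insert <tbody>).
my $browser_path = '/html/body/div/table/tbody/tr/td';

# Attribute-based path that does not depend on the exact nesting.
my $robust_path = '//div[@id="content"]//td[@class="price"]';

my @browser_hits = $tree->findnodes($browser_path);
my @robust_hits  = $tree->findnodes($robust_path);

printf "%-45s matched %d node(s)\n", $browser_path, scalar @browser_hits;
printf "%-45s matched %d node(s)\n", $robust_path,  scalar @robust_hits;

# Print the text of whatever the attribute-based path found.
print $_->as_text, "\n" for @robust_hits;

$tree->delete;
```

The attribute-based expression keeps working whether or not a tbody is present in the tree, which is the practical reason to prefer it.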
Re^5: can't extract node with HTML::TreeBuilder::XPath by tobyink (Canon) on Aug 01, 2012 at 06:35 UTC
Or you could read the HTML5 specification, which it almost perfectly complies with. That's the whole point of it: it doesn't need to document how it parses HTML, because it parses it per spec, and the same way as almost every modern browser.

`perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'`
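A rough sketch of what "the same way as a browser" means in practice, assuming HTML::HTML5::Parser returns an XML::LibXML document whose elements live in the XHTML namespace. The markup and the prefix "x" are invented for the example. Per the HTML5 parsing rules a tbody should be inserted, so the browser-style path is the one expected to match here.

```perl
use strict;
use warnings;
use HTML::HTML5::Parser;
use XML::LibXML;

# Hypothetical markup, invented for this example.
my $html = '<table><tr><td class="price">42</td></tr></table>';

# parse_string returns an XML::LibXML::Document.
my $doc = HTML::HTML5::Parser->new->parse_string($html);

# Elements are namespaced, so bind a prefix (here "x") for XPath.
my $xpc = XML::LibXML::XPathContext->new($doc);
$xpc->registerNs(x => 'http://www.w3.org/1999/xhtml');

my @with_tbody    = $xpc->findnodes('/x:html/x:body/x:table/x:tbody/x:tr/x:td');
my @without_tbody = $xpc->findnodes('/x:html/x:body/x:table/x:tr/x:td');

printf "with tbody: %d, without tbody: %d\n",
    scalar @with_tbody, scalar @without_tbody;
```

The price for browser-compatible trees is the namespace prefix in every step of the path.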
Re^6: can't extract node with HTML::TreeBuilder::XPath by Anonymous Monk on Aug 01, 2012 at 07:15 UTC
Re^7: can't extract node with HTML::TreeBuilder::XPath by tobyink (Canon) on Aug 01, 2012 at 10:04 UTC