The objectify_text call just seems to invert the problem, though I can be rather obtuse and may not be seeing the right way to use it.
I might be able to fit XML::LibXML into the full script and replace HTML::TreeBuilder::XPath. Here is my sketch:
#!/usr/bin/perl

use strict;
use warnings;
use XML::LibXML;

my $tree = XML::LibXML->load_xml(IO => \*DATA);

# XPath context rooted at the document element; 'u' is the prefix
# used in the XPath expressions for the XHTML namespace.
my $xpc = XML::LibXML::XPathContext->new( $tree->documentElement() );
$xpc->registerNs( 'u' => 'http://www.w3.org/1999/xhtml' );

for my $body ($xpc->findnodes('//u:body')) {
    # print $body->toString;
    # Print every child of <body>, elements and bare text alike.
    for my $n ($body->childNodes()) {
        print $n->toString;
    }
}
print "\n";

print "OK\n";

exit(0);

__DATA__
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content=
"HTML Tidy for HTML5 for Linux version 5.6.0" />
<title></title>
</head>
<body>
<p>foo</p>
<p>bar</p>
trololo
</body>
</html>
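If the full script needs to tell the bare text apart from the elements, rather than just print everything back out, something along these lines might work. It is only a rough sketch: body_children is a made-up helper name, not anything from the full script, and the label format it returns is just for illustration. It assumes the input is already well-formed XHTML, as in the sample above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper (not part of the full script): returns one label
# per child of <body>, skipping whitespace-only text nodes.
sub body_children {
    my ($xml) = @_;

    my $doc = XML::LibXML->load_xml(string => $xml);
    my $xpc = XML::LibXML::XPathContext->new($doc->documentElement());
    $xpc->registerNs('u' => 'http://www.w3.org/1999/xhtml');

    my @out;
    # node() matches both element children and text children of <body>.
    for my $n ($xpc->findnodes('//u:body/node()')) {
        if ($n->nodeType == XML_ELEMENT_NODE) {
            push @out, 'element:' . $n->nodeName . ':' . $n->textContent;
        }
        elsif ($n->nodeType == XML_TEXT_NODE) {
            my $text = $n->data;
            $text =~ s/^\s+|\s+$//g;    # trim surrounding whitespace
            push @out, "text:$text" if length $text;
        }
    }
    return @out;
}

my $xml = <<'XML';
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p>foo</p>
<p>bar</p>
trololo
</body>
</html>
XML

print "$_\n" for body_children($xml);
```

The stray "trololo" then shows up as an ordinary text node next to the two p elements, so there would be no need to wrap it in a pseudo-element first.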
In reply to Re^2: Seemingly Valid HTML which crashes HTML::TreeBuilder::XPath
by mldvx4
in thread Seemingly Valid HTML which crashes HTML::TreeBuilder::XPath
by mldvx4