I'm able to process WordPress' inexcusably messy output using HTML::TreeBuilder::XPath. However, when I try to call the delete method on the object, I get warnings or errors. I would like the problem to go away, but I have access neither to the site producing the document nor to WordPress upstream, which is where the fault actually lies. Anyway, when I call delete after otherwise successful processing of the document, I get the following error:
Deep recursion on subroutine "HTML::Element::delete" at /usr/share/perl5/HTML/Element.pm line 567.
Deep recursion on subroutine "HTML::Element::delete_content" at /usr/share/perl5/HTML/Element.pm line 580.
Can I just use undef instead of calling the delete method? Or is there a better approach? I use HTML::TreeBuilder::XPath extensively in the real script, but maybe I could or should do some pre-processing with a different parser, though I'd rather not.
Here is a stripped-down example of the problem:
#!/usr/bin/perl

use HTML::TreeBuilder::XPath;
use strict;
use warnings;

my $ent = HTML::TreeBuilder::XPath->new;
$ent->parse_file(\*DATA);
$ent->delete;

exit(0);

__DATA__
<html>
<head>
<title>foo bar</title>
</head>
<body>
foo
<br />
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<center> <center> <center> <center> <center> <center> <center> <center> <center> <center>
<strong>bar</strong>
<br />
<center>(baz)</center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</center> </center> </center> </center> </center> </center> </center> </center> </center> </center>
</body>
</html>
In reply to Unnesting deeply nested HTML elements (Deep recursion on subroutine "HTML::Element::delete") by mldvx4