However, I am not sure how to apply a substitution, s///, to an element containing more than just text.
The documentation of HTML::Element's content_refs_list gives an example of how to modify text nodes contained in an element and the documentation of HTML::Element::traverse shows how to use a recursive function to walk the tree. Putting those together:
sub html_trim {
    my $elem = shift;
    for my $itemref ($elem->content_refs_list) {
        if ( ref $$itemref ) { html_trim($$itemref) } # remove this for non-recursive
        else { $$itemref =~ s/^\s+|\s+$//g }
    }
}

for my $elem ($xhtml->findnodes('//div/ul/li')) {
    html_trim($elem)
}
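To see what this does to mixed content, here is a small self-contained sketch. The `<li>` element is a made-up sample built by hand with HTML::Element rather than parsed, so the whitespace handling of the parser doesn't get in the way; the routine itself is the same as above.

```perl
use strict;
use warnings;
use HTML::Element;

# Same routine as in the post: trim every text node, descending into children.
sub html_trim {
    my $elem = shift;
    for my $itemref ($elem->content_refs_list) {
        if ( ref $$itemref ) { html_trim($$itemref) }   # child element: recurse
        else { $$itemref =~ s/^\s+|\s+$//g }            # text node: trim both ends
    }
}

# Hypothetical sample, equivalent to: <li>  a <b> b </b> c  </li>
my $li = HTML::Element->new('li');
$li->push_content('  a ', ['b', ' b '], ' c  ');

my $before = $li->as_text;   # concatenated text before trimming
html_trim($li);
my $after  = $li->as_text;   # concatenated text after trimming

printf "before: [%s]\nafter:  [%s]\n", $before, $after;
```

Note that each text node is trimmed on its own, so whitespace next to a child element is removed as well, not only the whitespace at the very start and end of the `<li>`. If the spaces around inline elements matter, you would probably want to trim only the first and last text node instead.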
In reply to Re^3: Avoiding escaped child elements with HTML::TreeBuilder::XPath or HTML::Element
by haukex
in thread Avoiding escaped child elements with HTML::TreeBuilder::XPath or HTML::Element
by mldvx4