mldvx4 has asked for the wisdom of the Perl Monks concerning the following question:
I'd like to replace selected paragraphs in place but can't figure out the right way of doing so. Specifically, I'd like to replace a single, specific P with multiple P elements at the same point in the tree. I've tried dozens of variations of the code below, but what I have gives an error: "the target node's parent has no content!?"
I don't understand. The replace_with_content or push_content methods should have established something for postinsert to add to. Clearly I have missed something? What though?
#!/usr/bin/perl

use HTML::TreeBuilder::XPath;
use warnings;
use strict;

&readfile;

exit(0);

sub readfile {
    my ($file)= (@_);

    my $xhtml = HTML::TreeBuilder::XPath->new;
    $xhtml->implicit_tags(1);
    $xhtml->no_space_compacting(1);

    $xhtml->parse_file(\*DATA)
        or die();

    # find double-spaced paragraphs inside blockquotes and expand them
    for my $p ($xhtml->findnodes('//blockquote/p')) {
        my $text = $p->as_text();
        $text =~ s/^\s+//;
        $text =~ s/\s+$//;
        next unless($text =~/\n\s*\n\s*/);

        my @paragraphs = split(/\s*\n\s*/, $text);
        print qq(\t\@paragraphs=),join(',',@paragraphs),qq(\n);

        if ($#paragraphs >= 0) {
            my $pp = shift(@paragraphs);
            print qq(\t\tpp1=$pp\n);
            $p->replace_with_content();
            $p->push_content(['p',,$pp]);
            print qq(Identified :\n);
            print qq(«),$p->as_XML_indented,qq(»\n);

            foreach $pp (@paragraphs) {
                print qq(\t\tpp2=$pp\n);
                $p->postinsert(['p',,$pp]);
            }
        }
    }

    print qq(\n),qq(-)x30,qq(\n);
    my ($body) = $xhtml->findnodes('//body');
    print qq(\n);
    print $body->as_XML_indented;

    $xhtml->delete;
    return (1);
}

__DATA__
<body>
<blockquote id="one">
aaa

bbb

ccc
</blockquote>
<blockquote id="two">
<p>
ddd

eee

fff
</p>
</blockquote>
<blockquote id="three">
<p>
ggg
</p>
<p>
hhh
</p>
<p>
iii
</p>
</blockquote>
<blockquote id="four">
<p>
jjj
</p>
</blockquote>
</body>
The expected output would be for BLOCKQUOTE number two to contain three separate paragraphs instead of one (or four). The other P in the other BLOCKQUOTE elements should continue to be left alone, as the script currently does.
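For reference, here is a minimal sketch of one way the expected result might be reached. It is not necessarily the only or the best approach. It assumes the problem is that replace_with_content, as documented in HTML::Element, leaves $p detached from its parent, so a later postinsert on $p has nothing to insert into. The sketch instead builds the new P elements first and swaps them in with replace_with while the old P is still attached. The sample data is trimmed to two blockquotes, and the variable names (@new, @p_two, @p_four) are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;
use HTML::Element;

# Trimmed sample input (assumed layout: blank lines separate the chunks).
my $html = <<'EOT';
<body>
<blockquote id="two">
<p>
ddd

eee

fff
</p>
</blockquote>
<blockquote id="four">
<p>
jjj
</p>
</blockquote>
</body>
EOT

my $tree = HTML::TreeBuilder::XPath->new;
$tree->no_space_compacting(1);    # keep the blank lines inside the text
$tree->parse($html);
$tree->eof;

for my $p ($tree->findnodes('//blockquote/p')) {
    my $text = $p->as_text;
    $text =~ s/^\s+//;
    $text =~ s/\s+$//;

    # Only touch paragraphs that contain at least one blank line.
    next unless $text =~ /\n\s*\n/;

    # Build one new, not yet attached P element per chunk.
    my @new = map {
        my $e = HTML::Element->new('p');
        $e->push_content($_);
        $e;
    } split /\s*\n\s*\n\s*/, $text;

    # Swap the old P for the new ones while it still has a parent,
    # then free the detached original.
    $p->replace_with(@new);
    $p->delete;
}

# Count the P elements that ended up in each blockquote.
my @p_two  = $tree->findnodes('//blockquote[@id="two"]/p');
my @p_four = $tree->findnodes('//blockquote[@id="four"]/p');
print "two: ",  scalar(@p_two),  " paragraph(s)\n";
print "four: ", scalar(@p_four), " paragraph(s)\n";
print "  ", $_->as_text, "\n" for @p_two;
```

The list returned by findnodes is computed before the loop starts, so replacing nodes inside the loop should not disturb the iteration.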