PerlMonks  

Removing HTML beginning and ending tag with everything in between?

by djlerman (Sexton)
on Sep 01, 2011 at 20:03 UTC ( #923708=perlquestion )

djlerman has asked for the wisdom of the Perl Monks concerning the following question:

I need to remove an HTML tag along with everything between its opening and closing tags. I thought the best way was to use HTML::TokeParser or a regex. With HTML::TokeParser, I can't figure out how to remove a specific tag as well as its content.

With a regex, I can't get the search and replace to work. Example follows...

$content ="<div id="content"> BLa Bla Bla <div id='print'> *** TEXT TO BE REMOVED *** *** CODE TO BE REMOVED *** *** FORMATTING TO BE REMOVED *** </div> bla bla bla bla </div>";
$content =~ s/<div id='print'>(.*?)<\/div>//gis;
print $content;

Replies are listed 'Best First'.
Re: Removing HTML beginning and ending tag with everything in between?
by ikegami (Patriarch) on Sep 01, 2011 at 20:23 UTC

    For that very specific text, your substitution does work. (There's a syntax error in building your string, because you use " as the delimiter and didn't escape the " characters within the string.)
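
    To illustrate the quoting point, here is a minimal sketch that assumes a shortened version of the sample string from the question. The q{} quoting and the $stripped variable are choices made for this example, not part of the original post.

```perl
use strict;
use warnings;

# q{...} is a single-quote-like operator, so neither " nor ' needs escaping.
my $content = q{<div id="content"> BLa Bla Bla <div id='print'> *** TEXT TO BE REMOVED *** </div> bla bla bla bla </div>};

# Non-greedy match up to the first closing </div>. This is only safe here
# because the 'print' div contains no nested <div>.
(my $stripped = $content) =~ s/<div id='print'>.*?<\/div>//gis;

print "$stripped\n";
```

    If the div being removed could itself contain other div elements, the non-greedy match would stop at the first inner closing tag, which is one reason to prefer a parser.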

    Using XML::LibXML (which has an HTML parser), it would be:

    for my $node ($root->findnodes('//div[@id="print"]')) {
        $node->parentNode()->removeChild($node);
    }
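
    The loop above assumes $root already holds a parsed document. A self-contained sketch of the same approach might look like the following; the sample string is a shortened version of the one in the question, and the variable names $html, $doc and $content_div are assumptions made for this example.

```perl
use strict;
use warnings;
use XML::LibXML;

my $html = q{<div id="content"> BLa Bla Bla <div id='print'> *** TEXT TO BE REMOVED *** </div> bla bla bla bla </div>};

# recover => 1 asks the HTML parser to tolerate markup that is not well-formed.
my $doc = XML::LibXML->load_html( string => $html, recover => 1 );

# Detach every div whose id is "print", together with all of its children.
for my $node ( $doc->findnodes('//div[@id="print"]') ) {
    $node->parentNode->removeChild($node);
}

# The HTML parser wraps the fragment in <html><body>, so serialize
# only the outer div rather than the whole document.
my ($content_div) = $doc->findnodes('//div[@id="content"]');
print $content_div->toString, "\n";
```

    Unlike the regex, this removes the matched element as a whole subtree, so nested div elements inside it would not cause a premature match.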
Re: Removing HTML beginning and ending tag with everything in between?
by Kc12349 (Monk) on Sep 01, 2011 at 20:52 UTC

    Take a look at your single quotes versus double quotes. You have double quotes around "content" and single quotes around 'print'. Other than that, my output for your regex code is below. Is this not what you are looking for?

    <div id="content"> BLa Bla Bla bla bla bla bla </div>
