PerlMonks  

HTML::TokeParser Search and remove by ID

by djlerman (Sexton)
on Sep 09, 2014 at 18:54 UTC ( [id://1100012]=perlquestion )

djlerman has asked for the wisdom of the Perl Monks concerning the following question:

I have come once again seeking Perl wisdom.

I am using HTML::TokeParser and this is what I would like to do.

  • Find the containing div by ID. Lose everything before it.
  • Find the div to be removed by ID. It sits inside the containing div.
  • Copy the contents of the containing div to a new string, EXCEPT the tags and contents of the div to be removed.

I have read and searched for a solution, but TokeParser seems complex for my simple understanding.

    my $parser = HTML::TokeParser->new(\$content);
    while (my $token = $parser->get_tag("div")) {
        if ($token->[1]{id} =~ /containerID/i) {
            my $parser1 = HTML::TokeParser->new(\$token->as_is);
            while (my $token1 = $parser1->get_tag("div")) {
                next if ($token1->[1]{id} =~ /removeID/i);
                $out = ???;
            }
        }
    }
    my $newContent = "<html><head></head><body>" . $out . "</body></html>";

Replies are listed 'Best First'.
Re: HTML::TokeParser Search and remove by ID
by Anonymous Monk on Sep 09, 2014 at 21:45 UTC

      Unfortunately we only have HTML::TokeParser installed and will not be able to get another package installed.

        Sorry, I mean we only have the following installed:
        • /usr/lib/perl5/HTML/PullParser.pm
        • /usr/lib/perl5/HTML/Entities.pm
        • /usr/lib/perl5/HTML/LinkExtor.pm
        • /usr/lib/perl5/HTML/Parser.pm
        • /usr/lib/perl5/HTML/Filter.pm
        • /usr/lib/perl5/HTML/TokeParser.pm
        • /usr/lib/perl5/HTML/HeadParser.pm
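
With only the modules listed above, the three steps from the question can probably be done with HTML::TokeParser alone. The following is a minimal sketch, not a tested production solution. It reads every token with get_token (rather than get_tag) and keeps two counters for div nesting. The helper name extract_div, the sample markup and the exact-match comparison on the id are assumptions made for illustration; the question matched ids with a case-insensitive regex. The sketch also assumes the div tags in the input are properly balanced.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not from the thread): return the inner HTML of the
# div with id $keep_id, leaving out the div with id $drop_id entirely.
# Assumes <div> tags are properly balanced in the input.
sub extract_div {
    my ($content, $keep_id, $drop_id) = @_;
    my $parser = HTML::TokeParser->new(\$content)
        or die "Cannot create parser";

    my $out   = '';
    my $depth = 0;    # div nesting inside the containing div; 0 = not inside it
    my $skip  = 0;    # div nesting inside the div being removed; 0 = not skipping

    while (my $token = $parser->get_token) {
        my $type  = $token->[0];
        my $start = $type eq 'S' && $token->[1] eq 'div';
        my $end   = $type eq 'E' && $token->[1] eq 'div';
        # id attribute of a start tag, or '' when there is none
        my $id = $type eq 'S' && defined $token->[2]{id} ? $token->[2]{id} : '';
        # Original source text: index 1 for text tokens, last element otherwise
        my $raw = $type eq 'T' ? $token->[1] : $token->[-1];

        if ($depth == 0) {
            # Step 1: drop everything before the containing div
            $depth = 1 if $start && $id eq $keep_id;
            next;
        }
        if ($skip) {
            # Step 2: inside the div being removed; only track its nesting
            $skip++ if $start;
            $skip-- if $end;
            next;
        }
        if ($start) {
            if ($id eq $drop_id) { $skip = 1; next; }
            $depth++;
        }
        elsif ($end) {
            $depth--;
            last if $depth == 0;    # closing tag of the containing div
        }
        # Step 3: copy everything else verbatim
        $out .= $raw;
    }
    return $out;
}

# Sample input, made up for this sketch
my $content = '<html><body><p>before</p>'
    . '<div id="containerID"><p>keep</p>'
    . '<div id="removeID"><div>nested</div>gone</div>'
    . '<div>also keep</div></div><p>after</p></body></html>';

my $out = extract_div($content, 'containerID', 'removeID');
my $newContent = "<html><head></head><body>" . $out . "</body></html>";
print "$newContent\n";
```

The counters are there because a nested div inside either the container or the removed div would otherwise end the region at the first closing tag. If the ids need the looser matching from the question, the two eq comparisons could be swapped for regex matches.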
