I would suggest HTML::Parser and a simple state machine, though it will take more than two or three lines. You might even be able to use the ignore_elements, ignore_tags, and report_tags filters, together with the skipped_text argspec, to make the XS code do most of the filtering work. The handler callbacks then simply print or discard the text as needed. Alternatively, you can give the parser an array reference as the handler, so the XS code pushes a parse trace onto the array for use later in your program.
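Something along these lines might do it. This is an untested-in-production sketch, not a finished tool: the sub names strip_with_ignore_elements and strip_with_state_machine are made up for illustration, and both assume the element you want gone has a proper closing tag (a void element like br would leave the state machine stuck in "discard" mode). The first version lets the ignore_elements filter do the work; the second is the hand-rolled state machine with a depth counter.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Variant 1: let the parser's own filter drop the elements.
# ignore_elements skips both the tags and everything between them.
sub strip_with_ignore_elements {
    my ($html, @tags) = @_;
    my $out = '';
    my $p = HTML::Parser->new(
        api_version => 3,
        # Every event that survives the filter is copied through verbatim.
        default_h   => [ sub { $out .= $_[0] }, 'text' ],
    );
    $p->ignore_elements(@tags);
    $p->parse($html);
    $p->eof;
    return $out;
}

# Variant 2: a small state machine. $depth > 0 means "inside the
# unwanted element", so everything is discarded until it closes.
sub strip_with_state_machine {
    my ($html, $tag) = @_;
    my ($out, $depth) = ('', 0);
    my $p = HTML::Parser->new(
        api_version => 3,
        start_h => [ sub {
            my ($name, $text) = @_;
            $depth++ if $name eq $tag;
            $out .= $text unless $depth;
        }, 'tagname, text' ],
        end_h => [ sub {
            my ($name, $text) = @_;
            if ($name eq $tag) {
                $depth-- if $depth;   # leave the unwanted element
                return;               # never emit its closing tag
            }
            $out .= $text unless $depth;
        }, 'tagname, text' ],
        # Text, comments, declarations: keep only when outside the element.
        default_h => [ sub { $out .= $_[0] unless $depth }, 'text' ],
    );
    $p->parse($html);
    $p->eof;
    return $out;
}

print strip_with_ignore_elements('<p>Keep <b>drop this</b> me</p>', 'b'), "\n";
```

The state machine version is only worth the extra lines if you need to decide per element (say, by looking at attributes) whether to drop it; for a fixed list of tag names the filter is simpler and keeps the work in XS.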
In reply to Re: Looking for a module that strips an HTML tag and its associated 'TEXT'
by jcb
in thread Looking for a module that strips an HTML tag and its associated 'TEXT'
by nysus