in reply to Best practice: How to split HTML into paragraphs?
Use HTML::TreeBuilder. :-D
The two pieces of advice you quote are particularly apposite when parsing markup such as HTML. If you were really worried about "does too much", would you be using Perl?
The time that you spend figuring out how to use TreeBuilder to do the job will be much less than the time you would spend trying to rewrite the parts of HTML::Parser, HTML::Element and HTML::TreeBuilder that are involved in doing the work you need done.
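To give an idea of how little code that takes, here is a minimal sketch. It assumes that "paragraphs" means the text of each <p> element. The helper name paragraphs_from_html and the sample HTML are made up for illustration, so adapt it to whatever your real markup looks like.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return the text of each <p> element in an HTML string.
sub paragraphs_from_html {
    my ($html) = @_;

    # Parse the whole string into a tree of HTML::Element nodes.
    my $tree = HTML::TreeBuilder->new_from_content($html);

    # In list context, look_down returns every matching element in document order.
    # as_trimmed_text drops the tags and tidies up the whitespace.
    my @paragraphs = map { $_->as_trimmed_text } $tree->look_down( _tag => 'p' );

    # Free the tree explicitly; older versions of HTML::Element need this.
    $tree->delete;

    return @paragraphs;
}

# Made-up sample input.
my $html = '<html><body><p>First <b>paragraph</b>.</p><p>Second one.</p></body></html>';
print "$_\n" for paragraphs_from_html($html);
```

If your paragraphs are separated by something other than <p> tags (runs of <br>, say), the same look_down approach should still apply, but you would match on different elements and do a little more work on the content lists.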