in reply to Truncating HTML early
The best way to do this would be to use HTML::Parser to parse the entire document into a tree-like structure, using hashrefs to store information about each element (elements including both tags and text). Then walk the tree recursively, printing each element and keeping a running count of the plain-text words printed so far. At the top of the recursive sub, put an if statement that checks whether the word count has passed your limit, and also whether you're inside a table or any other tag you want to specify, so that you only stop at a point where it is safe to cut.
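Here is a rough sketch of that idea, just to make the shape of the recursion concrete. It is written in Python with the standard-library html.parser standing in for HTML::Parser, and dicts standing in for hashrefs. The names TreeBuilder, walk, truncate_html, VOID and PROTECTED are made up for the example, and so is the choice to treat the limit as a soft one: a text node that has been started is printed whole, and anything inside a protected tag is always finished.

```python
from html import escape
from html.parser import HTMLParser

VOID = {"br", "hr", "img", "input", "meta", "link"}  # tags with no closing tag
PROTECTED = {"table"}  # assumption: tags that should never be cut part-way


class TreeBuilder(HTMLParser):
    """Build a tree of dicts: {'tag', 'attrs', 'children'} or {'text'}."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = {"tag": None, "attrs": [], "children": []}
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": attrs, "children": []}
        self.stack[-1]["children"].append(node)
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; ignore stray closing tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        self.stack[-1]["children"].append({"text": data})


def walk(node, limit, state, protected=False):
    # The check at the top of the recursive sub: stop once the word
    # limit is reached, unless we are inside a protected tag.
    if state["words"] >= limit and not protected:
        return
    if "text" in node:
        state["words"] += len(node["text"].split())
        state["out"].append(escape(node["text"], quote=False))
        return
    tag = node["tag"]
    if tag is not None:
        attrs = "".join(
            f" {k}" if v is None else f' {k}="{escape(v)}"'
            for k, v in node["attrs"]
        )
        state["out"].append(f"<{tag}{attrs}>")
    for child in node["children"]:
        walk(child, limit, state, protected or tag in PROTECTED)
    # Every tag that was opened is also closed, so the output stays balanced.
    if tag is not None and tag not in VOID:
        state["out"].append(f"</{tag}>")


def truncate_html(html, limit):
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    state = {"words": 0, "out": []}
    walk(builder.root, limit, state)
    return "".join(state["out"])
```

Calling truncate_html(page, 200) returns the truncated markup as a string. Because the limit is only checked on entry to each node, the result can run a little over the limit, but every tag that was opened gets closed.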
- Silicon