As an alternative to the other excellent suggestions, you might consider using HTML::TreeBuilder. It will give you a data structure whose elements you can extract individually, so you needn't worry about splitting up nodes such as <table>. You could take this a step further and use the aforementioned tidy to turn your HTML into well-formed XHTML, and then use XML::DOM on it ;-}
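
Something along these lines might do as a starting point for the HTML::TreeBuilder route. It is only a rough sketch: truncate_html and its character limit are hypothetical names made up for illustration, and the policy of keeping whole top-level children of <body> until the text budget runs out is an assumption about what "truncating early" should mean, not the one true way.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::Entities qw(encode_entities);

# Hypothetical helper: keep whole top-level children of <body> until
# adding the next one would push the collected text past $limit characters.
# A node is either kept in full or dropped, so a <table> is never cut in half.
sub truncate_html {
    my ($html, $limit) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);
    my $body = $tree->find_by_tag_name('body');

    my $seen = 0;
    my @kept;
    for my $node ($body ? $body->content_list : ()) {
        # Children are either HTML::Element objects or plain text strings.
        my $text = ref $node ? $node->as_text : $node;

        # Always keep the first node, then stop once the budget is exceeded.
        last if @kept && $seen + length($text) > $limit;
        $seen += length $text;

        # The empty hashref asks as_HTML to emit every closing tag.
        push @kept, ref $node
            ? $node->as_HTML('<>&', undef, {})
            : encode_entities($node, '<>&');
    }

    $tree->delete;    # release the tree explicitly
    return join '', @kept;
}
```

You would call it as truncate_html($html, 500) and print the result. Because the cut always falls between sibling nodes, the output can come out shorter than the limit when a large node such as a table does not fit.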
/J\
In reply to Re: Truncating HTML early by gellyfish, in thread Truncating HTML early by nop