in reply to regex and HTML

Try using one of the HTML parsing modules, such as HTML::Parser or HTML::TokeParser, to do this. Parsing HTML with regexen is a perilous endeavor. You will get your project done much faster and with far fewer errors if you take the virtuous (i.e., lazy) route and use one of these modules.
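Since I don't know exactly what you're trying to pull out of your HTML, here is only a rough sketch of the HTML::TokeParser approach, assuming you want the links and their text. The `extract_links` helper and the sample markup are made up for illustration; adapt the tag names to whatever you actually need.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper (not from the original question):
# returns a list of [href, link text] pairs found in an HTML string.
sub extract_links {
    my ($html) = @_;

    # The constructor accepts a reference to a scalar holding the document.
    my $parser = HTML::TokeParser->new(\$html)
        or die "Could not create parser";

    my @links;
    # get_tag('a') skips ahead to the next <a> start tag and returns
    # an array ref whose second element is a hash of the tag's attributes.
    while (my $tag = $parser->get_tag('a')) {
        my $href = $tag->[1]{href};
        next unless defined $href;    # skip anchors that have no href

        # Collect the text up to the closing </a>, with whitespace trimmed.
        my $text = $parser->get_trimmed_text('/a');
        push @links, [$href, $text];
    }
    return @links;
}

# Made-up sample input.
my $html = '<p>See <a href="http://example.com/">the <b>example</b> site</a>.</p>';
printf "%s => %s\n", @$_ for extract_links($html);
```

The parser copes with nested inline tags, attribute order, and quoting for you, which is exactly where a hand-rolled regex tends to fall over.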

----
Coyote