in reply to html analysis tool via regex

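Try XML::XSH. It provides an interactive shell that lets you navigate the nodes of XML and HTML documents using XPath expressions, much as you would move around a filesystem, so you don't have to pick the markup apart with regexes.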