in reply to Recommendation on a module for HTML/XML extraction.
Despite the name, XML::LibXML can also handle HTML. tangent++ proposed using XPath, which XML::LibXML supports as well.
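A minimal sketch of how that might look, assuming a reasonably recent XML::LibXML (one that has load_html). The sample markup, the XPath expression and the variable names are made up for illustration; adapt them to your real document.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical sample markup, only for illustration
my $html = <<'HTML';
<html><body>
<ul>
<li><a href="/one">First</a></li>
<li><a href="/two">Second</a></li>
</ul>
</body></html>
HTML

# load_html uses the HTML parser instead of the XML parser.
# recover => 1 lets libxml2 carry on with sloppy real-world markup,
# suppress_errors keeps it quiet while doing so.
my $dom = XML::LibXML->load_html(
    string          => $html,
    recover         => 1,
    suppress_errors => 1,
);

# findnodes takes an XPath expression and returns the matching nodes
my @links = map { [ $_->getAttribute('href'), $_->textContent ] }
            $dom->findnodes('//li/a');

print "$_->[0] => $_->[1]\n" for @links;
```

For a file or a URL, load_html should also accept location => ... instead of string => ...; check the parser documentation for your installed version.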
Alexander