What you do not want to do is parse XML/XHTML/HTML yourself. There are a number of great Perl modules available for this. Take a look at HTML::Parser or XML::Twig as potential starting points. Getting your document into a meaningful data structure will vastly simplify the process of dealing with nested tags.
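
As a rough illustration of why a parser helps with nesting, here is a minimal sketch using HTML::Parser's version 3 event API. The sample markup and the names `@stack` and `@found` are made up for the example. It assumes HTML::Parser is installed from CPAN and that the input is reasonably well formed. The handlers keep a stack of open tags, so every piece of text can be recorded together with the path of elements that enclose it.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical sample input with nested tags.
my $html = '<div><p>Hello <b>nested</b> world</p></div>';

my @stack;    # names of the currently open elements, outermost first
my @found;    # [ path, text ] pairs collected while parsing

my $parser = HTML::Parser->new(
    api_version   => 3,
    unbroken_text => 1,    # report each run of text in one piece
    # Opening tag: remember that we are now inside this element.
    start_h => [ sub { push @stack, $_[0] }, 'tagname' ],
    # Closing tag: leave the element if it matches the innermost open one.
    end_h   => [ sub { pop @stack if @stack && $stack[-1] eq $_[0] }, 'tagname' ],
    # Text: record non-blank text along with its enclosing tag path.
    text_h  => [
        sub { push @found, [ join('/', @stack), $_[0] ] if $_[0] =~ /\S/ },
        'dtext',
    ],
);

$parser->parse($html);
$parser->eof;

# Print each text chunk next to the path of tags that contains it.
printf "%-10s %s\n", @$_ for @found;
```

For real documents with unclosed or mismatched tags, a tree-building module is probably a better fit than a hand-kept stack; XML::Twig fills that role for XML, and this sketch is only meant to show the general idea.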