For the first question: well, it is the oldest question in the monastery:
How do I recursively process files through directories
You can profit from a non-recursive solution by tachyon, Re: Win32 Recursive Directory Listing, but also see Descending a directory tree, returning a list of files and Recursive Directory print (where Laurent_R explains why tachyon's solution is not recursive).
You only need to adapt one of these solutions to meet your file extension requirements.
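To give the idea, here is a minimal sketch of a non-recursive traversal that keeps only files with the wanted extensions. It is not the code from any of the threads above: the sub name find_files_by_ext and its arguments are my own invention for illustration, and it uses only core modules. An explicit list of directories still to visit replaces the recursion.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (name and interface are assumptions, not from the
# linked threads): returns every file under $root whose extension is in
# @exts, walking the tree with an explicit to-do list instead of recursion.
sub find_files_by_ext {
    my ($root, @exts) = @_;
    my %want = map { lc($_) => 1 } @exts;   # case-insensitive extension lookup
    my @todo = ($root);                     # directories still to visit
    my @found;

    while (defined(my $dir = shift @todo)) {
        opendir(my $dh, $dir) or do { warn "Cannot open $dir: $!\n"; next };
        my @entries = File::Spec->no_upwards(readdir $dh);  # drop . and ..
        closedir $dh;

        for my $entry (sort @entries) {
            my $path = File::Spec->catfile($dir, $entry);
            next if -l $path;               # skip symlinks to avoid loops
            if (-d $path) {
                push @todo, $path;          # visit later, no recursive call
            }
            elsif (-f $path) {
                my ($ext) = $entry =~ /\.([^.]+)\z/;
                push @found, $path if defined $ext && $want{lc $ext};
            }
        }
    }
    return @found;
}

# Example call: print "$_\n" for find_files_by_ext('.', 'html', 'htm');
```

Using shift on the to-do list gives a breadth-first walk; changing it to pop gives depth-first. File::Find from the core distribution is another option if you do not mind its callback style.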
For the HTML file processing you just need a sub to call for such files: the sub must load the content and process the HTML with one of the several modules designed for this kind of work (
HTML::Parser?
HTML::TokeParser::Simple?
HTML::TreeBuilder::XPath?). In fact, parsing HTML with regular expressions is very rarely a good idea (even if you can).
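As a rough sketch of such a sub, assuming HTML::Parser is installed from CPAN and assuming, purely for illustration, that the processing you want is collecting link targets: the names extract_links and process_html_file are hypothetical, so replace the handler with whatever your real task needs.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: returns the href of every <a> tag in an HTML string.
sub extract_links {
    my ($html) = @_;
    my @links;
    my $parser = HTML::Parser->new(
        api_version => 3,
        # Called for each start tag; tag and attribute names arrive lowercased.
        start_h => [
            sub {
                my ($tag, $attr) = @_;
                push @links, $attr->{href}
                    if $tag eq 'a' && defined $attr->{href};
            },
            'tagname,attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;                           # flush any buffered input
    return @links;
}

# Hypothetical wrapper: loads one file and hands its content to the parser.
# Reads raw bytes; decode first if your files are not plain ASCII.
sub process_html_file {
    my ($path) = @_;
    open(my $fh, '<', $path) or die "Cannot read $path: $!";
    my $html = do { local $/; <$fh> };      # slurp the whole file
    close $fh;
    return extract_links($html);
}
```

You would then call process_html_file on each path that your directory traversal returns.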
HtH
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.