I've had success inspecting HTML tags and their contents, on an as-needed basis, by fetching pages with LWP::UserAgent and parsing the result with HTML::TreeBuilder or HTML::TableExtract. HTML::TreeBuilder returns your HTML as a tree that you can walk down, picking out the elements you want to extract. HTML::TableExtract is a more specialized parser, just for tables.
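
Here's a rough sketch of what that can look like. The sample HTML, the `note` class, and the `Item`/`Qty` column names are made up for illustration; a real page would come from LWP::UserAgent, as in the commented-out lines. Treat it as a starting point rather than a finished scraper.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::TableExtract;

# Sample HTML standing in for a fetched page (hypothetical content).
my $html = <<'HTML';
<html><body>
<h1>Inventory</h1>
<p class="note">Updated daily</p>
<table>
<tr><th>Item</th><th>Qty</th></tr>
<tr><td>Widget</td><td>4</td></tr>
<tr><td>Gadget</td><td>7</td></tr>
</table>
</body></html>
HTML

# To fetch a real page instead (the URL is whatever you need):
#   use LWP::UserAgent;
#   my $res = LWP::UserAgent->new->get($url);
#   die $res->status_line unless $res->is_success;
#   $html = $res->decoded_content;

# Build the tree, then pick elements out by tag name and attribute.
my $tree    = HTML::TreeBuilder->new_from_content($html);
my $h1      = $tree->look_down(_tag => 'h1');   # scalar context: first match
my $heading = $h1 ? $h1->as_text : '';
my @notes   = map { $_->as_text }
              $tree->look_down(_tag => 'p', class => 'note');
$tree->delete;   # free the tree when done

# Extract the data rows of any table whose header row has these columns.
my $te = HTML::TableExtract->new(headers => ['Item', 'Qty']);
$te->parse($html);
my @rows;
for my $table ($te->tables) {
    push @rows, [@$_] for $table->rows;
}

print "Heading: $heading\n";
print "Note: $_\n" for @notes;
print join(' | ', @$_), "\n" for @rows;
```

`look_down` takes attribute/value pairs, so you can narrow the match as much as the page's markup allows. With `headers` set, HTML::TableExtract only returns tables containing those column headings, which helps on pages with lots of layout tables.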