You can extract tabular data from HTML, XML, CSV, fixed-width, and many other file formats using either the AnyData module (a tied-hash interface) or the DBD::AnyData module (a DBI/SQL interface). For HTML tables, both use the excellent HTML::TableExtract module that b10m mentioned.
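Here is a minimal sketch of both interfaces reading the same CSV file, based on the synopses in the AnyData and DBD::AnyData docs. The sample data, the column names, and the table name `people` are made up for illustration, and both modules have to be installed from CPAN, so treat this as a starting point rather than tested production code.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use AnyData;   # tied-hash interface (CPAN, not core)
use DBI;       # DBD::AnyData must also be installed from CPAN

# Write a small sample CSV file; the columns and rows are hypothetical.
my ($fh, $csv_file) = tempfile(SUFFIX => '.csv', UNLINK => 1);
print {$fh} "name,country\nSue,de\nTom,us\n";
close $fh or die "close failed: $!";

# --- AnyData: tie the file to a hash-like table (read mode by default) ---
my $table = adTie('CSV', $csv_file);

# Look up a single value; by default rows are keyed on the first column.
my $country = $table->{Sue}->{country};

# Walk every row; each row is a hash ref keyed by column name.
my @names;
while (my $row = each %$table) {
    push @names, $row->{name};
}
undef $table;  # close the table

# --- DBD::AnyData: register the same file as a table and query it with SQL ---
my $dbh = DBI->connect('dbi:AnyData(RaiseError=>1):');
$dbh->func('people', 'CSV', $csv_file, 'ad_catalog');
my $rows = $dbh->selectall_arrayref(
    "SELECT name FROM people WHERE country = 'de'"
);
$dbh->disconnect;

print "Sue lives in: $country\n";
print "All names: @names\n";
print "SQL match: $rows->[0][0]\n";
```

For an HTML source you would swap the format name (the docs call it 'HTMLtable') and point at an HTML file instead; the rest of the code should look the same, which is most of the appeal of these modules.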