If it is XHTML, then you can use XML::Twig, which may get you there faster. Alternatively, take a look at the range of HTML parsers such as HTML::TokeParser. Using regexen for parsing HTML is fraught!
Try benchmarking different approaches using a representative but very small test table. Maybe you could post a small sample table here, with a description of the elements you need to pull out of it, so we can provide some real sample code for you to work from? Here's a template to get you started:
use strict;
use warnings;
use HTML::TableExtract;

my $html = do {local $/; <DATA>};
my $te = HTML::TableExtract->new;
$te->parse($html);

__DATA__
<table></table>
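For the benchmarking itself, something along these lines might do as a starting point. It is only a sketch: the sample table is made up, the sub names cells_te and cells_tp are mine, and it assumes all you want is the text of every cell. It uses the core Benchmark module to compare HTML::TableExtract against a bare HTML::TokeParser loop.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use HTML::TableExtract;
use HTML::TokeParser;

# Made-up sample table; substitute a small slice of your real data.
my $html = <<'HTML';
<table>
<tr><td>a</td><td>1</td></tr>
<tr><td>b</td><td>2</td></tr>
</table>
HTML

# Approach 1: HTML::TableExtract, flattening every row of every table.
sub cells_te {
    my $te = HTML::TableExtract->new;
    $te->parse($html);
    $te->eof;    # tell the underlying parser that the document is complete
    return [ map { @$_ } map { $_->rows } $te->tables ];
}

# Approach 2: HTML::TokeParser, grabbing the text of each <td> in turn.
sub cells_tp {
    my $p = HTML::TokeParser->new(\$html);
    my @cells;
    while ($p->get_tag('td')) {
        push @cells, $p->get_trimmed_text('/td');
    }
    return \@cells;
}

# Run each approach for at least one CPU second and print a comparison.
cmpthese(-1, {
    TableExtract => \&cells_te,
    TokeParser   => \&cells_tp,
});
```

Check that both subs return the same cells for your data before trusting the timings, since a faster approach that pulls out the wrong thing is no help.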
Sometimes, if there is a lot of work to do, you just gotta do a lot of work!
In reply to Re: Speed Up HTML::TableExtract
by GrandFather
in thread Speed Up HTML::TableExtract
by Anonymous Monk