in reply to Re: HTML::Parser fun
in thread HTML::Parser fun

Thanks for the info. I seem to recall testing HTML::TreeBuilder and finding that it lagged behind HTML::Parser in performance (HTML::TokeParser::Simple was the worst performer, but the easiest to use).

Our problem is that the performance penalty really adds up when we're processing hundreds of millions of files...

Hence the choice of HTML::Parser. Now that I've got a taste of its performance benefits, I'm loath to let go.