Thank you for this one; I forgot to mention
HTML::TreeBuilder, but I had it in mind too.
I hope to find the time to add the other variations too; once that's done, I'd like to make the style uniform across the different approaches and then comment a bit on the pros and cons of each.
I'll probably avoid benchmarking (for the reasons given in
No More Meaningless Benchmarks!), though I'm not sure yet. In the end, I hope to have collected a few code samples that might make good reading for anyone who takes on the HTML parsing task.
I'm still looking for a good title ("Parsing HTML"?) and a good place to put the whole thing in the end. I'm tempted to make it a set of linked nodes in Code Catacombs, but I haven't decided yet.
Thanks again.