Now, applying this knowledge, my feeling is that using TreeBuilder would again hurt performance. Right?
Probably. I'm curious: how many HTML pages will you be parsing per second in your finished product?
| [reply] |
This question does not exactly apply. ;-) But for the link-extraction part, Parser took around 0.08 secs, where Extor took 0.12 secs and Extractor 0.28 secs (for average HTML).
Any help with my Parser question?
| [reply] |
This question does not exactly apply. ;-) But for the link-extraction part, Parser took around 0.08 secs, where Extor took 0.12 secs and Extractor 0.28 secs (for average HTML).
Okay, let me rephrase then... Why do you need such high performance for your project?
Any help with my Parser question?
No, sorry, I kind of promised myself not to waste any more time with HTML::Parser. It is a great module, but it is low-level, and I can solve any HTML parsing problem much faster with higher-level modules like HTML::TreeBuilder.
| [reply] |