I need to spider websites to look for certain HTML elements and determine their X/Y coordinates when rendered in a browser.
I can think of two approaches:
1. Use a client workstation and a custom web application: download each page, use JavaScript to locate the elements via the DOM, report their coordinates back, then fetch the next page (a minimal sketch of the JavaScript side follows this list).
Pros: I can do this without a steep learning curve.
Cons: Ties up a workstation doing a batch job. Kludgey. JavaScript.
2. I think Mozilla XPCOM can do this, and there even exists a Perl interface to XPCOM.
Pros: A single-program, single-computer solution.
Cons: I don't know for sure that XPCOM can actually do this (I've posted a question over there to make sure), and a wicked learning curve if it can.
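For approach 1, the in-page script mostly amounts to walking each element's offsetParent chain to compute its rendered page position and then shipping the results back. Here is a minimal sketch, assuming for illustration that the elements of interest are images and that the results are posted to a hypothetical /report-coords endpoint on the custom web app:

```javascript
// Walk the offsetParent chain to get an element's page X/Y coordinates.
function pageCoords(el) {
    var x = 0, y = 0;
    while (el) {
        x += el.offsetLeft;
        y += el.offsetTop;
        el = el.offsetParent;
    }
    return { x: x, y: y };
}

// Assumption for illustration: the elements of interest are <img> tags.
var imgs = document.getElementsByTagName('img');
var report = [];
for (var i = 0; i < imgs.length; i++) {
    var c = pageCoords(imgs[i]);
    report.push(imgs[i].src + ',' + c.x + ',' + c.y);
}

// Post the results back to the controlling web app (URL is hypothetical);
// the app can then send the browser on to the next page in the batch.
var req = new XMLHttpRequest();
req.open('POST', '/report-coords', false);
req.send(report.join('\n'));
```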
My question is, have I missed an alternative approach?