cleverett has asked for the wisdom of the Perl Monks concerning the following question:
I have a need to spider websites to look for certain HTML elements and determine their X/Y coordinates when rendered in a browser.
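For context, the crawl-and-parse half looks like the easy part. Something along these lines (a sketch only: LWP::UserAgent plus HTML::TreeBuilder, with <img> standing in for whatever elements I actually need) gets me the elements, but of course not their rendered positions:

  #!/usr/bin/perl
  use strict;
  use warnings;

  use LWP::UserAgent;
  use HTML::TreeBuilder;

  # Fetch one page and list the elements of interest.
  # Parsing alone gives the elements, not where the browser would
  # actually draw them -- that still needs a rendering engine.
  my $url = shift @ARGV or die "usage: $0 <url>\n";
  my $ua  = LWP::UserAgent->new( agent => 'coord-spider/0.1' );

  my $resp = $ua->get($url);
  die "GET $url failed: ", $resp->status_line, "\n" unless $resp->is_success;

  my $tree = HTML::TreeBuilder->new_from_content( $resp->content );

  # <img> is just an example; substitute the real elements here.
  for my $el ( $tree->look_down( _tag => 'img' ) ) {
      my $id = $el->attr('id') || '(no id)';
      print "$url\t$id\n";
  }

  $tree->delete;  # free the parse tree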
For the coordinate part, I can think of two approaches:
1. Use a client workstation and a custom web application: the browser downloads each page, JavaScript finds the elements via the DOM and reads off their coordinates, then reports them back to the server and fetches the next page (a rough sketch of the reporting script follows the list).
Pros: I can do this without a steep learning curve.
Cons: Ties up a workstation doing a batch job. Kludgey. JavaScript.
2. I think Mozilla XPCOM can do this, and there even exists a Perl interface to XPCOM.
Pros: single program, single computer solution.
Cons: I don't know for sure that XPCOM can actually do this; I have a question in over there to make sure. A wicked learning curve if it can.
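To make option 1 a bit more concrete, here is roughly the collection end I have in mind (a sketch only: the script name, the url/id/x/y parameter names, the flat log file, and the hardcoded next URL are all placeholders, and real queue handling is omitted). The page served to the workstation would carry JavaScript that measures the elements and POSTs the numbers back here:

  #!/usr/bin/perl
  # report.cgi - sketch of the server side of approach 1
  use strict;
  use warnings;
  use CGI;

  my $q = CGI->new;

  # One measurement per request; the field names are made up.
  my @fields = map { my $v = $q->param($_); defined $v ? $v : '' }
               qw(url id x y);

  open my $log, '>>', '/tmp/coords.tsv' or die "open coords.tsv: $!";
  print {$log} join( "\t", @fields ), "\n";
  close $log;

  # Hand back the next URL to measure (queue logic left out of the sketch).
  print $q->header('text/plain');
  print "http://example.com/next-page-in-queue\n";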
My question is, have I missed an alternative approach?
Replies are listed 'Best First'.
Re: Unique spidering need
by Roger (Parson) on Jan 23, 2004 at 01:50 UTC
by cleverett (Friar) on Jan 23, 2004 at 05:28 UTC
by Roger (Parson) on Jan 23, 2004 at 05:45 UTC
by cleverett (Friar) on Jan 23, 2004 at 06:10 UTC
Re: Unique spidering need
by ViceRaid (Chaplain) on Jan 23, 2004 at 02:07 UTC
Re: Unique spidering need
by CountZero (Bishop) on Jan 23, 2004 at 07:12 UTC
by cleverett (Friar) on Jan 23, 2004 at 07:33 UTC
by merlyn (Sage) on Jan 23, 2004 at 13:28 UTC
by cleverett (Friar) on Jan 24, 2004 at 03:05 UTC
by herveus (Prior) on Jan 23, 2004 at 13:19 UTC