in reply to •Re: Web Site Mapper
in thread Web Site Mapper
Perhaps I overlooked something in WWW::Robot's documentation, but it doesn't appear to quite fit what he wanted to do. As he said in the update, for his purposes he needed to ignore robots.txt rules, and I couldn't see any way to turn that checking off.
And on the second point: the sub will return (almost) immediately from a link it has already visited, so it should stop once it has exhausted all pages it hasn't already indexed, and it finishes by returning from what is likely a rather long list of links it has already been to.
Perhaps it would be more efficient to check whether a link has already been seen before invoking the sub inside the foreach?
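A minimal sketch of that idea: track visited URLs in a hash and test it before the recursive call, so the crawler never even enters the sub for a page it has already mapped. The names here (`%seen`, `map_page`, `get_links`) are illustrative, not from the original code.

```perl
use strict;
use warnings;

our %seen;    # URLs we have already visited

sub map_page {
    my ($url) = @_;
    return if $seen{$url}++;          # belt-and-braces: mark and bail out
    my @links = get_links($url);      # assumed link-extraction routine
    for my $link (@links) {
        # Check before the call, rather than recursing and returning early
        map_page($link) unless $seen{$link};
    }
}

# Stub for demonstration: a tiny three-page "site" with cycles.
my %site = (
    'a' => [ 'b', 'c' ],
    'b' => [ 'a', 'c' ],
    'c' => [ 'a' ],
);
sub get_links { return @{ $site{ $_[0] } // [] } }

map_page('a');
print scalar( keys %seen ), "\n";     # each page visited exactly once
```

The `unless $seen{$link}` guard saves a sub call (and the argument copying that comes with it) per duplicate link, while the `return if $seen{$url}++` line keeps the sub safe even if it is ever called directly on a visited page.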
Re: Re: •Re: Web Site Mapper
by hardburn (Abbot) on Feb 16, 2004 at 14:24 UTC