in reply to Re: Re: parallel downloading
in thread spidering, multi-threading and netiquette
The best way to ensure you have reasonable delays between your requests is to use a User Agent that enforces those delays, e.g. LWP::RobotUA or LWP::Parallel::RobotUA.
You should also keep in mind that just because the server doesn't have a robots.txt today doesn't mean it won't have one tomorrow ... so make sure your code checks for it each time it's run: WWW::RobotRules.
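A minimal sketch of the RobotUA approach (the bot name, contact address, and URL are placeholders): LWP::RobotUA fetches and honors each site's robots.txt for you and enforces a minimum delay between requests, so both points above are handled without hand-rolled sleep() calls.

```perl
use strict;
use warnings;
use LWP::RobotUA;

# agent and from are required; pick something that identifies
# you so the site admin can reach you if your bot misbehaves.
my $ua = LWP::RobotUA->new(
    agent => 'MyBot/0.1',             # hypothetical bot name
    from  => 'webmaster@example.com', # hypothetical contact
);
$ua->delay(1);   # minimum delay between requests, in *minutes*

# robots.txt is fetched and re-checked automatically, so a rule
# the site adds tomorrow is picked up the next time this runs.
my $resp = $ua->get('http://example.com/some/page.html');
print $resp->is_success ? $resp->decoded_content : $resp->status_line;
```

LWP::Parallel::RobotUA works the same way but lets you register several URLs and fetch them concurrently while still respecting the per-host delay.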