in reply to HTTP::Lite GET - too many requests?

Do you have your own website, perchance? One obvious approach (at least from that lead-in) would be to duplicate the format on your own test site and scrape against that first, but I'm not going to suggest it: my guess is that it's far more work than any other solution. Instead, I suggest looking at your own site's access logs. Check whether Google or any other search engine has crawled your site. I bet you'll see delays between its requests, which should point to a reasonable amount of time to sleep between fetches. My guess is about 1-5 seconds.
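
If it helps, here's a rough sketch of pulling those delays out of a log. It's in Python purely for illustration, and it rests on assumptions of mine: that the log uses Apache-style Combined Log Format timestamps, and that the crawler can be picked out by a substring such as "Googlebot" in the line. The function names `crawler_gaps` and `suggested_delay` are made up for this example.

```python
import re
from datetime import datetime
from statistics import median

# Assumed timestamp shape (Combined Log Format): [10/Oct/2023:13:55:36 +0000]
TIMESTAMP = re.compile(r"\[([^\]]+)\]")


def crawler_gaps(lines, agent="Googlebot"):
    """Return the seconds between consecutive requests whose line mentions `agent`."""
    times = []
    for line in lines:
        if agent not in line:
            continue  # not the crawler we are interested in
        match = TIMESTAMP.search(line)
        if match:
            times.append(datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z"))
    times.sort()
    # Pair each request with the next one and measure the gap between them.
    return [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]


def suggested_delay(lines, agent="Googlebot"):
    """Median gap in seconds, or None if the crawler made fewer than two requests."""
    gaps = crawler_gaps(lines, agent)
    return median(gaps) if gaps else None
```

Something like `suggested_delay(open("access.log"))` would then give a starting point for the sleep. I used the median so that one long pause in the crawl doesn't skew the number.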

I'm betting you're simply hitting an automatic web-host DoS countermeasure: self-managed iptables rules that drop incoming packets from apparent DoS attackers. Whoever is hosting the journalist's site is blocking you directly. There are likely a dozen (or a hundred) other sites you're also temporarily blocked from, but you won't notice those ;-)
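
For what it's worth, here's a minimal sketch of a fetch loop that sleeps between requests and backs off when a request fails. It's Python for illustration only and all of it is my assumption, not anything HTTP::Lite provides: `polite_fetch_all` is a made-up name, `fetch` is whatever function you already use to get a page, and I'm assuming a dropped connection shows up as an `OSError` (timeouts and refused connections do in Python).

```python
import time


def polite_fetch_all(urls, fetch, delay=2.0, max_backoff=60.0, sleep=time.sleep):
    """Fetch each URL in turn, pausing after every request.

    After a failure the pause doubles (capped at max_backoff);
    after a success it drops back to the base delay.
    """
    results = {}
    wait = delay
    for url in urls:
        try:
            results[url] = fetch(url)
            wait = delay  # success: return to the normal pace
        except OSError:
            results[url] = None  # record the failure and slow down
            wait = min(wait * 2, max_backoff)
        sleep(wait)
    return results
```

The `sleep` parameter is only there so the loop can be tried out without actually waiting. In real use you'd leave it at the default.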
