Re: RFC: LWP::UserAgent hit counter

Replies are listed 'Best First'.

Re^2: RFC: LWP::UserAgent hit counter
by bliako (Abbot) on Jun 04, 2018 at 10:38 UTC

Hi, this is interesting, thanks!

That said, let me clarify my situation a bit more: I prefer to handle the throttle myself. For example, sometimes I get a server timeout, in which case I repeat my hit, but only after sleeping for a longish time (I know they are probably doing a backup, because it occurs at more or less the same time). Normally, though, I sleep for shorter times in a loop. Some pages I access less often, and I loop over them with a very small sleep value; other pages I access more frequently, and there the sleep time must be longer.
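To make the retry-after-timeout idea concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the helper name `fetch_with_backoff`, its named arguments, the 600-second default, and the choice of HTTP 408 and 504 as "timeout" statuses. Timeouts raised on the client side may surface differently, which is why the test is a replaceable coderef.

```perl
use strict;
use warnings;

# Hypothetical helper: run one hit via the 'request' coderef and, if the
# response looks like a timeout, pause for a long time and try again.
sub fetch_with_backoff {
    my (%arg) = @_;
    my $request    = $arg{request};                       # coderef returning a response object
    my $pause      = $arg{pause}      // sub { sleep $_[0] };
    my $long_wait  = $arg{long_wait}  // 600;             # assumed: seconds to wait out a backup
    my $max_tries  = $arg{max_tries}  // 3;
    # Assumption: 408 and 504 are the statuses worth a long sleep.
    my $is_timeout = $arg{is_timeout}
        // sub { my $code = $_[0]->code; $code == 408 || $code == 504 };

    my $res;
    for my $try (1 .. $max_tries) {
        $res = $request->();
        return $res unless $is_timeout->($res);
        $pause->($long_wait) if $try < $max_tries;        # no pause after the last try
    }
    return $res;    # still timing out after $max_tries attempts
}
```

With LWP this could be called as `fetch_with_backoff(request => sub { $ua->get($url) })`, where `$ua` and `$url` are whatever the surrounding script already has. Passing the sleep in as `pause` keeps the policy in the caller's hands.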

Most importantly, I need my sleeps to be variable, seemingly random. Right now, they are drawn from a random distribution with a mean and a standard deviation which I control.
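One way to get such sleeps is sketched below. It assumes a normal distribution, sampled with the Box-Muller transform; the post does not say which distribution is actually used. The names `normal_sample` and `random_delay`, and the 0.5-second floor, are made up for the example.

```perl
use strict;
use warnings;

# Draw one value from a normal distribution (Box-Muller transform).
sub normal_sample {
    my ($mean, $sd) = @_;
    my $u1 = 1 - rand();    # in (0, 1], so log() never sees 0
    my $u2 = rand();
    my $z  = sqrt(-2 * log($u1)) * cos(2 * 3.141592653589793 * $u2);
    return $mean + $sd * $z;
}

# Hypothetical helper: a sleep time with the given mean and standard
# deviation, clamped to a floor so it is never negative or near zero.
sub random_delay {
    my ($mean, $sd, $min) = @_;
    $min //= 0.5;           # assumed floor, in seconds
    my $t = normal_sample($mean, $sd);
    return $t < $min ? $min : $t;
}
```

For fractional seconds, `Time::HiRes::sleep(random_delay(10, 3))` works; the built-in `sleep` only takes whole seconds. Note that the clamp shifts the effective mean slightly upward when the floor is close to the mean.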

From the source code of the package you mentioned, it looks like it overrides the send_request() method so that it sleeps for a FIXED amount of time and then does the request. The throttle value (sleep seconds) can be replaced by a throttle function which returns a random number of seconds to sleep, drawn from a statistical distribution. That can be useful. However, my need for different throttles in different situations (i.e. GET/POST requests to the same site, not just different websites) still exists.
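The per-situation requirement could be met with a small lookup table in the calling script, independent of any throttling module. This is only a sketch: the `%PROFILE` table, its keys, the function `throttle_profile`, and all the numbers are placeholders, not anything taken from the package discussed above.

```perl
use strict;
use warnings;

# Hypothetical profiles: mean and standard deviation of the sleep, in seconds.
my %PROFILE = (
    GET           => { mean =>   5, sd =>  2 },
    POST          => { mean =>  20, sd =>  5 },
    after_timeout => { mean => 600, sd => 60 },
    default       => { mean =>  10, sd =>  3 },
);

# Pick the profile for the next hit: a timeout on the previous hit wins,
# then the HTTP method, then the default.
sub throttle_profile {
    my ($method, $timed_out) = @_;
    return $PROFILE{after_timeout} if $timed_out;
    return $PROFILE{ uc($method // '') } // $PROFILE{default};
}
```

The returned mean and sd can then be fed to whatever random sleep generator is in use. To tell apart pages on the same site, the key could be extended with the URL path.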
