Doesn't the error tell you that? | [reply] |
No, it doesn't. It tells you that a resource is unavailable, but not which resource or why. It may be a per-process resource that is unavailable because a per-process limit has been exceeded, or it may be some other resource, or unavailable for some other reason.
Cycling the server processes after some reasonable number of requests will do little to identify which resource is unavailable or why, but it is easy to do and the result will either be consistent with expectations or not: just another piece of slightly relevant information that can be added to the puzzle. It is similar to, but not exactly the same as, a graceful restart of the server, which we already know resets the counter (whatever form the counter to 30,000 takes).
update: Thinking about this further, if the limit were per-process it would be odd for it to be reached at exactly 30,000 requests every time, as there are effectively up to 6 server processes running (MaxClients 150 divided by ThreadsPerChild 25). Unless the 30,000 requests were allocated across the child processes strictly round robin, or always to a single process, one would expect a variable number of requests to succeed until, somewhat randomly, one of the child processes reached its per-process limit. After that one might expect intermittent failures until all child processes had reached the limit. This suggests that the resource constraint is per user or system wide, perhaps, rather than per-process.
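To illustrate that reasoning, here is a rough simulation sketch. It is not a model of how Apache actually dispatches requests: the per-child limit of 5,000 is a made-up number (chosen only so that 6 children add up to 30,000), and random dispatch is an assumption standing in for "anything other than strict round robin".

```python
import random

def first_failure(num_children, per_child_limit, rng, round_robin=False):
    """Return the 1-based number of the first request that lands on a
    child which has already used up its (hypothetical) per-child limit."""
    counts = [0] * num_children
    request = 0
    while True:
        request += 1
        if round_robin:
            child = (request - 1) % num_children
        else:
            child = rng.randrange(num_children)  # assumed random dispatch
        if counts[child] >= per_child_limit:
            return request
        counts[child] += 1

# 150 MaxClients / 25 ThreadsPerChild = 6 child processes
NUM_CHILDREN = 150 // 25
PER_CHILD_LIMIT = 5000  # hypothetical, for illustration only

rng = random.Random(1)
random_runs = [first_failure(NUM_CHILDREN, PER_CHILD_LIMIT, rng)
               for _ in range(20)]
round_robin_run = first_failure(NUM_CHILDREN, PER_CHILD_LIMIT, rng,
                                round_robin=True)

print("random dispatch, first failing request per run:", random_runs)
print("round robin, first failing request:", round_robin_run)
```

With random dispatch the first failing request differs from run to run, whereas strict round robin fails at the same request every time. A failure at exactly the same count on every run therefore fits a shared counter (or strict round robin) better than independent per-process limits.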
| [reply] |
I have tried setting MaxRequestsPerChild to different values. The outcome is that the script takes much longer to make the calls, but it again starts returning error 500 on the 30,000th request.
| [reply] |
So the constraint must be common to all the processes, not within or applied to each process separately. And, as you get the same result with various CGI scripts, it is unlikely that the CGI scripts are consuming the resource. This implies that the server is consuming the resource (either the common server process or the child processes).
It has been suggested that you run strace (I understand tusc is the equivalent on HP-UX) on the server to see what is happening when it fails. An alternative would be to attach a debugger to one of the failing server processes and walk through to the failing system call. It would help to know for certain which system call is failing.
| [reply] |