Re: Apache / 30,000 Perl script serving limit
by almut (Canon) on May 06, 2009 at 14:59 UTC
... Subsequent calls give 500 Internal Server Error.
In case of an Internal Server Error there's usually some corresponding message in the webserver's error log, which in this case might give a clue as to what exactly is enforcing the limit. If you find nothing there, you might also want to look in the syslog, etc.
A quick googling turned this up for me. I haven't investigated it any further yet, but it seems to hint at the "maxuprc" kernel parameter, or some related setting. Check with /usr/sbin/kmtune what your settings are...
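For example (from memory, so double-check the syntax on your box), piping the full listing through grep should show both tunables of interest:

/usr/sbin/kmtune | grep -E 'maxuprc|nproc'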
This appears to be the relevant spot in Apache's CGI handling code (mod_cgid / mod_cgi) where the child process for the script is created:

rc = ap_os_create_privileged_process(r, procnew, command, argv, env,
                                     procattr, p);
if (rc != APR_SUCCESS) {
    /* Bad things happened. Everyone should have cleaned up. */
    ap_log_rerror(APLOG_MARK, APLOG_ERR|APLOG_TOCLIENT, rc, r,
                  "couldn't create child process: %d: %s", rc,
                  apr_filename_of_pathname(r->filename));
}
If you dig a bit deeper, you'll find that eventually fork() is being called (unsurprisingly), i.e.
if ((new->pid = fork()) < 0) {
    return errno;
}
in ./srclib/apr/threadproc/unix/proc.c, in the routine apr_proc_create().
The errno eventually ends up in rc, which is being reported as 11 (numerically), or as "Resource temporarily unavailable" (in text form). The corresponding symbolic name is EAGAIN, which - if you look in HP-UX's fork(2) manpage - is returned under these two circumstances:
ERRORS
    If fork() fails, errno is set to one of the following values.

    [EAGAIN]    The system-imposed limit on the total number of
                processes under execution would be exceeded.

    [EAGAIN]    The system-imposed limit on the total number of
                processes under execution by a single user would
                be exceeded.
The former limit is the "nproc" setting (unlikely to be the cause here, as you can still run other programs); the latter limit is the already mentioned "maxuprc" tunable. In other words, I'd say the theory fits too well to be ruled out just yet... :)
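If you want to see the same errno from the Perl side, here's a minimal sketch (untested, made up for illustration only) that does a single fork() and checks whether a failure is EAGAIN:

#!/usr/bin/perl
# Illustrative only: report whether a failed fork() returns EAGAIN,
# i.e. errno 11 / "Resource temporarily unavailable", which is what
# you'd expect if nproc or maxuprc had been exhausted.
use strict;
use warnings;
use POSIX qw(EAGAIN);

my $pid = fork();
if (!defined $pid) {
    if ($! == EAGAIN) {
        die "fork failed with EAGAIN (errno ", 0 + $!, "): $!\n";
    }
    die "fork failed for some other reason: $!\n";
}
elsif ($pid == 0) {
    exit 0;                 # child: do nothing and exit
}
else {
    waitpid($pid, 0);       # parent: reap the child
    print "fork succeeded (child pid $pid)\n";
}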
Monitoring the OS, we found the total number of processes in use each hour (at XX:00) never exceeded 10, which represented about 0.9% usage.
How exactly did you investigate this? Are you sure you don't have any zombies lingering around, or some such? What do ps and top say when the limit has been reached? Is the limit exactly 30000, or maybe 30K, with K being 1024? Do you really need to reboot the machine, or is simply restarting Apache sufficient? (Use stop and then start - not restart or graceful - to be sure you actually get a new Apache parent process.)
A common cause is forking lots of children without reclaiming the zombies using wait or waitpid.
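For example, a minimal (untested) sketch of that reclaiming: fork a few children, then waitpid() for each of them so none are left as <defunct> zombies.

#!/usr/bin/perl
# Illustrative only: fork a handful of children, then reap every one
# of them. Skipping the reaping loop (or an equivalent $SIG{CHLD}
# handler) is what leaves zombie entries behind in ps.
use strict;
use warnings;

for my $n (1 .. 5) {
    my $pid = fork();
    die "fork failed: $!\n" unless defined $pid;
    if ($pid == 0) {
        exit 0;    # child: pretend to do some work, then exit
    }
}

# parent: block until every child has been collected
while ((my $kid = waitpid(-1, 0)) > 0) {
    print "reaped child $kid\n";
}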
Re: Apache / 30,000 Perl script serving limit
by BrowserUk (Patriarch) on May 06, 2009 at 17:49 UTC
My guess (offered only because you do not seem to be getting a lot of other leads) is that you are running out of file handles?
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Apache / 30,000 Perl script serving limit
by derby (Abbot) on May 06, 2009 at 17:53 UTC
environment.pl doesn't do much (it just lists environment variables and their values), but I tested with many other scripts we have, such as language change scripts, survey engines, the w3c validator, etc. All of them result in the same error at the same iteration.
LoadModule access_module modules/mod_access.so
LoadModule auth_module modules/mod_auth.so
LoadModule include_module modules/mod_include.so
LoadModule log_config_module modules/mod_log_config.so
LoadModule env_module modules/mod_env.so
LoadModule setenvif_module modules/mod_setenvif.so
LoadModule mime_module modules/mod_mime.so
LoadModule status_module modules/mod_status.so
LoadModule cgid_module modules/mod_cgid.so
LoadModule dir_module modules/mod_dir.so
LoadModule alias_module modules/mod_alias.so
LoadModule rewrite_module modules/mod_rewrite.so
# IMPORTANT NOTE re autoindex_module: we don't want to turn on Indexes,
# but if we don't load autoindex_module, a 404 message is given (no index.htm[l]) instead of a 403 Forbidden!
LoadModule autoindex_module modules/mod_autoindex.so
# Load WebLogic's proxy module
LoadModule weblogic_module modules/mod_wl_20.so
# Load PHP 5 module
LoadModule php5_module modules/libphp5.so
# To use PHP uncomment the following lines
AddType application/x-httpd-php .php
AddType application/x-httpd-php-source .phps
# Load LDAP stuff
LoadModule ldap_module modules/mod_ldap.so
LoadModule auth_ldap_module modules/mod_auth_ldap.so
<IfModule mod_perl.c>
PerlModule ModPerl::Registry
PerlModule Apache::compat
PerlModule Apache::ServerRec
<Files *.pl>
SetHandler perl-script
PerlHandler ModPerl::Registry::handler
Options +ExecCGI
PerlOptions +ParseHeaders
</Files>
</IfModule>
<IfModule mod_cgid.c>
#
# Additional to mod_cgid.c settings, mod_cgid has Scriptsock <path>
# for setting UNIX socket for communicating with cgid.
#
Scriptsock logs/cgisock
</IfModule>
Personally, I never studied mod_perl, and perhaps all our scripts are using legacy CGI modules. I think I need to read up on this. Should only one of mod_perl or mod_cgid be loaded? I have disabled mod_perl for now and am re-running the test. Updates to come shortly.
The test with mod_perl disabled proved unsuccessful. Perhaps the next step is to look into migrating all our scripts to mod_perl and disabling mod_cgid instead.
Huge thanks to everyone for their collaboration so far!
Re: Apache / 30,000 Perl script serving limit
by holli (Abbot) on May 07, 2009 at 06:14 UTC
I have done a few greps around, and I am still looking into it, but in the meanwhile I stumbled upon this HP-UX information sheet, where we can see that the 30000 number coincides with the failsafe and default value for process_id_max. But that setting seems to work correctly: past pid 30000, the numbering starts over at a low pid and continues to increment. Still, it is impossible to call more than 30,000 perl scripts per Apache 'restart'.
Re: Apache / 30,000 Perl script serving limit
by QcMonk (Novice) on May 08, 2009 at 15:44 UTC
Today we installed Apache/2.2.8 HP-UX_Apache-based_Web_Server (Unix) mod_perl/1.99_16 Perl/v5.8.8 DAV/2. And guess what? The issue is non-existent. We have not found what caused the issue on the previous version of Apache, or which, if any, of our production environment configurations causes it, but we will sequentially replicate the production environment with the new software and see where/if it re-appears. It may turn out that the new version simply resolves the problem.
I would have liked to reply to everyone who contributed, to say that I greatly esteem them and this community for their good intentions and willingness to help.
Here's to hoping I will be able to help some of the newer folks here as well down the line.
Did anyone have any more time to look at this again? I have a very similar issue and would love to know what caused it in version 2.0.59. I am running HP-UX B.11.31 U ia64.
I have tried all of the above suggestions and so far nothing seems to assist with the issue.
Regards
Ben
In your httpd.conf, try removing mod_cgi and enabling only mod_perl. For us this solved the 30,000 perl script execution limit even on the Apache 2.0 web server, but it can mean that some of your cgi scripts will need to be modified to be compatible with mod_perl. A few of ours needed some 'my $variables' changed to 'our $variables'. This porting documentation helped.
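Roughly, this is the kind of change that was needed (a made-up example, not one of our actual scripts): under ModPerl::Registry the whole script gets compiled into a subroutine, so a named sub that uses a file-level lexical gives "variable will not stay shared" and can keep seeing a stale value between requests, whereas a package variable behaves as expected.

#!/usr/bin/perl
# Made-up illustration of the 'my' -> 'our' change described above.
use strict;
use warnings;

# The original style, fine as a plain CGI but surprising under
# ModPerl::Registry ("Variable $hit_count will not stay shared"):
#
#     my $hit_count = 0;
#     sub bump_count { return ++$hit_count; }

# Declaring a package variable instead keeps it shared as intended:
our $hit_count = 0;
sub bump_count { return ++$hit_count; }

print "Content-type: text/plain\n\n";
print "count: ", bump_count(), "\n";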
Re: Apache / 30,000 Perl script serving limit
by ig (Vicar) on May 07, 2009 at 01:09 UTC
Have you tried setting MaxRequestsPerChild to something other than 0? Doing so won't identify the limit that is being hit but it may help localize it. If changing this makes the problem disappear then you know you are dealing with a per-server-process limit of some sort.
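For example (the number below is arbitrary, just to illustrate the idea):

# Recycle each child process after it has served this many requests
# (0 means never recycle).
MaxRequestsPerChild 500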
Doesn't the error tell you that?
No, it doesn't. It tells you that a resource is unavailable, but not which resource or why. It may be a per-process resource that is unavailable because a per-process limit has been exceeded, or it may be some other resource, unavailable for some other reason.
Cycling the server processes after some reasonable number of requests will do little to identify what resource is unavailable or why, but it is easy to do, and the result will either be consistent with expectations or not - just another piece of slightly relevant information that can be added to the puzzle. It is similar to, but not exactly the same as, a graceful restart of the server, which we already know resets the counter (whatever form the counter to 30,000 takes).
Update: thinking about this further, it is odd that the limit would always be reached at exactly 30,000 requests if there were a per-process limit, as there are effectively up to 6 server processes running (MaxClients 150 and ThreadsPerChild 25). Unless the 30,000 requests were allocated across the child processes strictly round-robin, or always to a single process, one would expect a variable number of requests to succeed until, somewhat randomly, one of the child processes reached its per-process limit, after which one might expect intermittent results until all child processes had reached the limit. This suggests that the resource constraint is per user or system wide, perhaps, rather than per-process.
So the constraint must be common to all the processes, not within or applied to each process separately. And, as you get the same result with various CGI scripts, it is unlikely that it is the CGI scripts that are consuming the resource. This implies that the server is consuming the resource (either the common server process or the child processes).
It has been suggested to run strace (I understand tusc is the equivalent on HP-UX) on the server to see what is happening when it fails. An alternative would be to attach a debugger to one of the failing server processes and walk through to the failing system call. It might help to be certain what system call is failing.