QcMonk has asked for the wisdom of the Perl Monks concerning the following question:

Greetings,

Would anyone have any insight as to why Apache would require a graceful restart after exactly 30000 requests to perl scripts on a webserver?

To test I have the following script running on the shell of the server:

#!/usr/bin/perl
use strict;
use warnings;
use lib "/web_sites/cgi-modules";
use LWP::Simple;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use HTML::LinkExtor;

my $url     = "http://devintranet2/cgi-bin/environment.pl";
my $counter = 0;

while ($counter < 60000) {
    my $browser = LWP::UserAgent->new();
    $browser->timeout(10);
    my $request  = HTTP::Request->new(GET => $url);
    my $response = $browser->request($request);
    if ($response->is_error()) {
        #printf "%s\n", $response->status_line;
        $counter = $counter + 1;
        print "$counter [";
        printf "%s]\n", $response->status_line;
        die;
    }
    my $contents = $response->content();
    #print "$contents";
    #sleep(0.3);
    $counter = $counter + 1;
    print "$counter [";
    printf "%s]\n", $response->status_line;
}
We have Apache 2 and Perl 5.8 running on HP-UX. After the limit is hit, subsequent calls give 500 Internal Server Error (everything else on the web server remains functional, and from the shell, Perl scripts still run fine indefinitely). The httpd.conf has the following, but changing these values does not affect the 30k maximum requests before CGI becomes unavailable:
<IfModule worker.c>
    ServerLimit          16
    StartServers          2
    MaxClients          150
    MinSpareThreads      25
    MaxSpareThreads      75
    ThreadsPerChild      25
    MaxRequestsPerChild   0
</IfModule>
(I beg forgiveness if this question is more mod_perl / Apache related.) Thanks, everyone.

Replies are listed 'Best First'.
Re: Apache / 30,000 Perl script serving limit
by almut (Canon) on May 06, 2009 at 14:59 UTC
    ... Subsequent calls give 500 Internal Server Error.

    In case of an Internal Server Error there's usually some corresponding message in the webserver's error log, which in this case might give a clue as to what exactly is enforcing the limit. If you find nothing there, you might also want to look in the syslog, etc.

      Verily so,
      apache-default error log contains:

      [Wed May 06 10:25:40 2009] [error] (11)Resource temporarily unavailable: couldn't create child process: 11: environment.pl

      the virtual server error log has:

      [Wed May 06 10:25:40 2009] [error] [client 198.73.83.73] Premature end of script headers: environment.pl

      I have googled "Resource temporarily unavailable: couldn't create child process" quite often and extensively, but found no one with a similar problem.

        A quick googling found this for me. I haven't investigated it any further yet, but it seems to hint at the "maxuprc" kernel parameter, or some related setting. Check with /usr/sbin/kmtune what your settings are...
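
        Something like this (an untested sketch; kmtune's exact output format may vary between HP-UX releases) would pull the tunables of interest out for a quick look:

        #!/usr/bin/perl
        # Sketch: show the kernel tunables most likely involved (maxuprc, nproc).
        # Assumes /usr/sbin/kmtune prints one parameter per line, name first.
        use strict;
        use warnings;

        open my $kmtune, '-|', '/usr/sbin/kmtune'
            or die "cannot run kmtune: $!";
        while (my $line = <$kmtune>) {
            print $line if $line =~ /^\s*(?:maxuprc|nproc)\b/;
        }
        close $kmtune;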

        (11)Resource temporarily unavailable: couldn't create child process: 11: environment.pl

        That error message is produced by this snippet in mod_cgi(d).c of the Apache sources (in routine run_cgi_child()):

        rc = ap_os_create_privileged_process(r, procnew, command, argv, env,
                                             procattr, p);
        if (rc != APR_SUCCESS) {
            /* Bad things happened. Everyone should have cleaned up. */
            ap_log_rerror(APLOG_MARK, APLOG_ERR|APLOG_TOCLIENT, rc, r,
                          "couldn't create child process: %d: %s", rc,
                          apr_filename_of_pathname(r->filename));
        }

        If you dig a bit deeper, you'll find that eventually fork() is being called (unsurprisingly), i.e.

        if ((new->pid = fork()) < 0) {
            return errno;
        }

        in ./srclib/apr/threadproc/unix/proc.c, in the routine apr_proc_create().

        The errno eventually ends up in rc, which is reported as 11 (numerically), or as "Resource temporarily unavailable" (in text form). The corresponding symbolic form is EAGAIN, which - if you look in HP-UX's fork(2) manpage - is returned under these two circumstances:

        ERRORS
            If fork() fails, errno is set to one of the following values.

            [EAGAIN]  The system-imposed limit on the total number of
                      processes under execution would be exceeded.

            [EAGAIN]  The system-imposed limit on the total number of
                      processes under execution by a single user would
                      be exceeded.

        The former limit is the "nproc" setting (unlikely to be the cause here, as you can still run other programs); the latter limit is the already mentioned "maxuprc" tunable.  In other words, I'd say the theory fits too well to be ruled out completely, yet... :)
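
        Just to make the failure mode concrete, here's a minimal Perl sketch of how such an EAGAIN from fork() shows up in userland (purely illustrative, not Apache's actual code path):

        #!/usr/bin/perl
        # Illustration: what an EAGAIN from fork() looks like from Perl.
        use strict;
        use warnings;
        use Errno qw(EAGAIN);

        my $pid = fork();
        if (!defined $pid) {
            # fork() failed; $! carries the errno, both numerically and as text
            if ($! == EAGAIN) {
                printf "fork failed with EAGAIN (%d): %s\n", 0 + $!, $!;
                # i.e. nproc or maxuprc exhausted, per fork(2)
            }
            else {
                die "fork failed for another reason: $!";
            }
        }
        elsif ($pid == 0) {
            exit 0;               # child does nothing
        }
        else {
            waitpid($pid, 0);     # reap the child
            print "fork succeeded, pid $pid\n";
        }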

        Monitoring the OS, we found that the total number of processes in use each hour (at XX:00) never exceeded 10 and represented about 0.9% usage.

        How exactly did you investigate this?  Are you sure you don't have any zombies lingering around, or some such? What do ps and top say when the limit has been reached?  Is the limit exactly 30000, or maybe 30K, with K being 1024? Do you really need to reboot the machine, or is simply restarting Apache sufficient? (use stop, then start - not restart or graceful - to be sure to actually get a new process for the Apache parent)

        A common cause is forking lots of children without reclaiming the zombies using wait or waitpid.
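
        The usual fix looks something like this (a minimal sketch of a forking parent, not anything from the OP's setup):

        #!/usr/bin/perl
        # Sketch: reap children as they exit so they never linger as zombies.
        use strict;
        use warnings;
        use POSIX qw(WNOHANG);

        # Reap any finished children whenever SIGCHLD arrives
        $SIG{CHLD} = sub {
            1 while waitpid(-1, WNOHANG) > 0;
        };

        for my $n (1 .. 10) {
            my $pid = fork();
            die "fork failed: $!" unless defined $pid;
            if ($pid == 0) {
                exit 0;           # child: pretend to do some work, then exit
            }
            # parent carries on; the handler reaps children as they finish
        }
        sleep 2;                  # give the children time to exit and be reaped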
Re: Apache / 30,000 Perl script serving limit
by BrowserUk (Patriarch) on May 06, 2009 at 17:49 UTC
Re: Apache / 30,000 Perl script serving limit
by derby (Abbot) on May 06, 2009 at 17:53 UTC

    Well ... what does environment.pl do? And what other Apache modules, if any, do you have enabled?

    -derby

      environment.pl doesn't do much; it lists environment variables and their values. But I tested with many other scripts we have, such as language change scripts, survey engines, the W3C validator, etc., and all result in the same error at the same iteration.

      Modules are listed below:

      LoadModule access_module     modules/mod_access.so
      LoadModule auth_module       modules/mod_auth.so
      LoadModule include_module    modules/mod_include.so
      LoadModule log_config_module modules/mod_log_config.so
      LoadModule env_module        modules/mod_env.so
      LoadModule setenvif_module   modules/mod_setenvif.so
      LoadModule mime_module       modules/mod_mime.so
      LoadModule status_module     modules/mod_status.so
      LoadModule cgid_module       modules/mod_cgid.so
      LoadModule dir_module        modules/mod_dir.so
      LoadModule alias_module      modules/mod_alias.so
      LoadModule rewrite_module    modules/mod_rewrite.so

      # IMPORTANT NOTE re autoindex_module: we don't want to turn on Indexes,
      # but if we don't load autoindex_module, a 404 message is given (no index.htm[l]),
      # instead of a 403 forbidden!
      LoadModule autoindex_module  modules/mod_autoindex.so

      # Load WebLogic's proxy module
      LoadModule weblogic_module   modules/mod_wl_20.so

      # Load PHP 5 module
      LoadModule php5_module       modules/libphp5.so
      # To use PHP uncomment the following lines
      AddType application/x-httpd-php .php
      AddType application/x-httpd-php-source .phps

      # Load LDAP stuff
      LoadModule ldap_module       modules/mod_ldap.so
      LoadModule auth_ldap_module  modules/mod_auth_ldap.so

      <IfModule mod_perl.c>
          PerlModule ModPerl::Registry
          PerlModule Apache::compat
          PerlModule Apache::ServerRec
          <Files *.pl>
              SetHandler perl-script
              PerlHandler ModPerl::Registry::handler
              Options +ExecCGI
              PerlOptions +ParseHeaders
          </Files>
      </IfModule>

      <IfModule mod_cgid.c>
          #
          # Additional to mod_cgid.c settings, mod_cgid has Scriptsock <path>
          # for setting UNIX socket for communicating with cgid.
          #
          Scriptsock logs/cgisock
      </IfModule>

      Personally, I have never studied mod_perl, and perhaps all our scripts are using legacy CGI modules. I think I need to read up on this. Should only one of the mod_perl and mod_cgid modules be loaded? I have disabled mod_perl for now and am re-running the test. Updates to come shortly.

        The test proved unsuccessful with mod_perl disabled. Perhaps the next step for me is to look into migrating all our scripts to mod_perl and disabling mod_cgid.

        Huge thanks to everyone for their collaboration so far!

Re: Apache / 30,000 Perl script serving limit
by holli (Abbot) on May 07, 2009 at 06:14 UTC
    Is it just me or does the number of 30,000 feel "artificial" for a computer? I mean normally all "natural" boundaries are powers of 2. So as a matter of despair: Do you see something suspicious when you do a system wide full text search for that number? (Maybe start looking in /etc and ~ first)


    holli

    When you're up to your ass in alligators, it's difficult to remember that your original purpose was to drain the swamp.

      I have done a few greps around, and I am still looking into it, but in the meantime I have stumbled upon this HP-UX information sheet, where we can see that the 30000 number coincides with the failsafe and default value for process_id_max. But this setting seems to work correctly: past PID 30000 the numbering starts over at a low PID and continues to increment. Still, it is impossible to call more than 30,000 Perl scripts per Apache restart.
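
      If it helps to see that wrap-around in action, a throwaway Perl sketch like the one below (purely illustrative) just forks, reaps, and prints the PIDs the kernel hands out; near the limit you would see the numbers jump from around 29999 back to something small:

      #!/usr/bin/perl
      # Sketch: watch PIDs climb toward process_id_max and wrap around.
      use strict;
      use warnings;

      for (1 .. 20) {
          my $pid = fork();
          die "fork failed: $!" unless defined $pid;
          if ($pid == 0) { exit 0 }   # child exits immediately
          waitpid($pid, 0);           # reap it, so no zombies accumulate
          print "got pid $pid\n";
      }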

Re: Apache / 30,000 Perl script serving limit
by QcMonk (Novice) on May 08, 2009 at 15:44 UTC

    Today we installed Apache/2.2.8 HP-UX_Apache-based_Web_Server (Unix) mod_perl/1.99_16 Perl/v5.8.8 DAV/2. And guess what? The issue is non-existent. We have not found what caused the issue on the previous version of Apache, or which, if any, of our production environment configurations causes it, but we will sequentially replicate the production environment with the new software and see where, or if, it re-appears. It may turn out that the new version simply resolves the problem.

    I would have liked to reply to everyone who contributed, to tell them that I very much esteem them and this community for all their good intentions and willingness to help.

    Here's to hoping I will be able to help some of the newer folks here as well down the line.

      Did anyone have any more time to look at this again? I have a very similar issue and would love to know what caused it in version 2.0.59. I am running HP-UX B.11.31 U ia64. I have tried all of the above suggestions and so far nothing seems to help with the issue. Regards, Ben

        In your httpd.conf, try removing mod_cgid and enabling only mod_perl. For us this solved the 30,000 Perl script execution limit even on the Apache 2.0 web server, but it can mean that some of your CGI scripts will need to be modified to be compatible with mod_perl. A few of ours required some 'my $variable' declarations to be changed to 'our $variable'. This porting documentation helped.
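
        For anyone hitting the same porting issue: ModPerl::Registry wraps the whole script in a subroutine and caches it, so a named sub that references a file-scoped 'my' variable keeps seeing the value from the first request. A hypothetical sketch of the kind of change involved (the names are made up, not from the OP's scripts):

        #!/usr/bin/perl
        # Under ModPerl::Registry this file becomes the body of a cached sub.
        # With "my $name = ..." here, print_greeting() would close over the
        # first request's copy and show stale data on later requests
        # ("Variable will not stay shared"). A package variable avoids that.
        use strict;
        use warnings;

        our $name;                              # was: my $name
        $name = $ENV{QUERY_STRING} || 'world';

        print_greeting();

        sub print_greeting {
            print "Content-type: text/plain\n\n";
            print "Hello, $name\n";
        }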

Re: Apache / 30,000 Perl script serving limit
by ig (Vicar) on May 07, 2009 at 01:09 UTC

    Have you tried setting MaxRequestsPerChild to something other than 0? Doing so won't identify the limit that is being hit but it may help localize it. If changing this makes the problem disappear then you know you are dealing with a per-server-process limit of some sort.

      Doesn't the error tell you that?

        No, it doesn't. It does tell you that there is a resource unavailable but it doesn't say what resource or why it is unavailable. It may be a per-process resource unavailable because a per-process limit has been exceeded, or it may be some other resource or unavailable for some other reason.

        Cycling the server processes after some reasonable number of requests will do little to identify what resource is exhausted or why it is unavailable, but it is easy to do, and the result will either be consistent with expectations or not - just another piece of slightly relevant information that can be added to the puzzle. It is similar to, but not exactly the same as, a graceful restart of the server, which we already know resets the counter (whatever form the counter to 30,000 takes).

        update: Thinking about this further, it is odd that the limit would always be reached at exactly 30,000 requests if there is a per-process limit, as there are effectively up to 6 server processes running (MaxClients 150 divided by ThreadsPerChild 25). Unless the 30,000 requests were allocated across the child processes strictly round-robin, or always to a single process, one would expect a variable number of requests to succeed until, somewhat randomly, one of the child processes reached its per-process limit, after which one might expect intermittent results until all child processes had reached the limit. This suggests that the resource constraint is per user or system-wide, perhaps, rather than per-process.

      I have tried setting MaxRequestsPerChild to different values and the outcome is that it takes much longer for the script to make the calls, but it again starts returning error 500 on the 30,000th request.

        So the constraint must be common to all the processes, not within or applied to each process separately. And, as you get the same result with various CGI scripts, it is unlikely that it is the CGI scripts that are consuming the resource. This implies that the server is consuming the resource (either the common server process or the child processes).

        It has been suggested to run strace (I understand tusc is the equivalent on HP-UX) on the server to see what is happening when it fails. An alternative would be to attach a debugger to one of the failing server processes and walk through to the failing system call. It might help to be certain what system call is failing.
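
        One low-tech way to be looking at the right moment is to extend the original test loop so it snapshots the process table as soon as the first error comes back. A sketch, assuming the URL from the original post and that a plain 'ps -ef' is good enough on your box:

        #!/usr/bin/perl
        # Sketch: hammer the CGI URL and dump rough process counts when the
        # first 500 appears. URL and ps usage are assumptions - adjust to taste.
        use strict;
        use warnings;
        use LWP::UserAgent;

        my $url = "http://devintranet2/cgi-bin/environment.pl";
        my $ua  = LWP::UserAgent->new(timeout => 10);

        my $count = 0;
        while (1) {
            my $response = $ua->get($url);
            $count++;
            next unless $response->is_error();

            print "request $count failed: ", $response->status_line, "\n";
            system("ps -ef | wc -l");              # total number of processes
            system("ps -ef | grep -c defunct");    # rough count of zombies
            last;
        }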