in reply to Re^2: Apache / 30,000 Perl script serving limit
in thread Apache / 30,000 Perl script serving limit

(11)Resource temporarily unavailable: couldn't create child process: 11: environment.pl

That error message is produced by this snippet in mod_cgi(d).c of the Apache sources (in routine run_cgi_child()):

    rc = ap_os_create_privileged_process(r, procnew, command, argv, env,
                                         procattr, p);
    if (rc != APR_SUCCESS) {
        /* Bad things happened. Everyone should have cleaned up. */
        ap_log_rerror(APLOG_MARK, APLOG_ERR|APLOG_TOCLIENT, rc, r,
                      "couldn't create child process: %d: %s", rc,
                      apr_filename_of_pathname(r->filename));
    }

If you dig a bit deeper, you'll find that eventually fork() is being called (unsurprisingly), i.e.

    if ((new->pid = fork()) < 0) {
        return errno;
    }

in ./srclib/apr/threadproc/unix/proc.c, in the routine apr_proc_create().

The errno eventually ends up in rc, which is being reported as 11 (numerically), or as "Resource temporarily unavailable" (text form). The corresponding symbolic form is EAGAIN, which - if you look in HP-UX's fork(2) manpage - is being returned under these two circumstances:

    ERRORS
         If fork() fails, errno is set to one of the following values.

         [EAGAIN]    The system-imposed limit on the total number of
                     processes under execution would be exceeded.

         [EAGAIN]    The system-imposed limit on the total number of
                     processes under execution by a single user would
                     be exceeded.

The former limit is the "nproc" setting (unlikely to be the cause here, as you can still run other programs); the latter limit is the already mentioned "maxuprc" tunable.  In other words, I'd say the theory fits too well to be ruled out completely, yet... :)
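To make the errno reasoning concrete, here is a minimal C sketch (not taken from Apache or APR). The helper names fork_failure_hint() and per_user_process_limit() are made up for illustration. The second one asks POSIX sysconf() for _SC_CHILD_MAX, the per-user process limit; I'd expect that to correspond to maxuprc on HP-UX, but treat that as an assumption and check it on your box.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: turn the errno of a failed fork() into a short
 * hint about which limit to look at. */
const char *fork_failure_hint(int err)
{
    switch (err) {
    case EAGAIN:
        /* Both cases from the fork(2) manpage end up here. */
        return "process limit reached (system-wide or per-user)";
    case ENOMEM:
        return "not enough memory or swap for a new process";
    default:
        return "unexpected fork() error";
    }
}

/* Hypothetical helper: per-user process limit as reported by POSIX
 * sysconf(). Returns -1 if the limit is indeterminate or on error. */
long per_user_process_limit(void)
{
    return sysconf(_SC_CHILD_MAX);
}
```

Printing per_user_process_limit() from a small test program run as the web server user would show what the OS thinks the limit is for that account, which can then be compared against the number of processes ps reports for it.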

Monitoring the OS, we found the total number of processes used each hour (at XX:00) never exceeded 10 and represented about 0.9% usage.

How exactly did you investigate this?  Are you sure you don't have any zombies lingering around, or some such?  What do ps and top say when the limit has been reached?  Is the limit exactly 30000, or maybe 30K, with K being 1024?  Do you really need to reboot the machine, or is simply restarting Apache sufficient? (Use stop, then start, not restart or graceful, to be sure you actually get a new process for the Apache parent.)
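As background on the zombie question: a child that has exited keeps its slot in the process table until its parent collects the exit status, and those slots count against the process limits. The following is a generic C sketch of the fork-then-reap pattern, not the way Apache or mod_cgid actually manages its children. The function name spawn_and_reap() is made up, and the child simply exits as a stand-in for a CGI script.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a short-lived child and reap it.
 * Returns 0 on success, or the errno of the failing call. */
int spawn_and_reap(void)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return errno;   /* e.g. EAGAIN when a process limit is hit */
    if (pid == 0)
        _exit(0);       /* child: stand-in for the CGI script */

    /* Parent: without this waitpid() the exited child would remain
     * in the process table as a zombie until the parent exits. */
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR)
            return errno;
    }
    return 0;
}
```

If the waitpid() loop were left out and the function called repeatedly from a long-running parent, ps would show a growing number of defunct entries, and fork() would eventually start failing with EAGAIN.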

Re^4: Apache / 30,000 Perl script serving limit
by QcMonk (Novice) on May 06, 2009 at 19:22 UTC

    I can tell you are very knowledgeable in these things, Almut. I will try to provide answers to these questions as best I can. The number of processes used each hour was given to me by a system admin. We used HP-UX kcusage (watching maxuprc), and while the test was running we watched the nproc value, which barely increased.

    I shall also look into what ps and top report; I have not yet done this.

    Finally, the restart is merely a graceful restart from the webmin console, and all is well thereafter for exactly another 30,000 requests.

    To see if it changes anything, I will now proceed to Stop and Start the server instead.

      Just to follow up on the restarting: if "graceful" already suffices to reset the problem, using "stop" and "start" instead wouldn't contribute anything to solving the issue... I guess I just misread your "require a reboot" to mean the machine would need to be rebooted (which would've surprised me...)

Re^4: Apache / 30,000 Perl script serving limit
by QcMonk (Novice) on May 06, 2009 at 19:57 UTC

    Alright, same thing after the stop and start as you already figured.

    Perhaps worth noting is that it's a new PID every so often.

    www 11004 7696 2 15:46:29 ? 0:00 /usr/bin/perl /web_sites/intranet2/cgi-bin/environment.pl
    www 11067 7696 4 15:46:31 ? 0:00 /usr/bin/perl /web_sites/intranet2/cgi-bin/environment.pl
    www 11210 7696 2 15:46:37 ? 0:00 /usr/bin/perl /web_sites/intranet2/cgi-bin/environment.pl
    www 11264 7696 4 15:46:39 ? 0:00 /usr/bin/perl /web_sites/intranet2/cgi-bin/environment.pl

      What seems to be more interesting than that they do get a new PID (which is to be expected with mod_cgid) is that they're still running after several seconds (observe the STIME column) — presuming the output you've shown is of a single ps call, of course.  In case environment.pl is just printing out the environment, this shouldn't take several seconds... so I would try to find out why that is...

      BTW, even on HP-UX, you can get a nice tree-like display of processes with ps (similar to what the BSD-style option f, or --forest, does on Linux). This is often helpful to easily see which processes forked which...  For this, you'd need to enable XPG4 mode, which you do via the env variable UNIX95. This makes option -H available, so you can then write, for example

      $ UNIX95=1 ps -efH

      (not sure if you're aware of it, so I thought it might be worth mentioning...)

        You are quite the observer, my friend. Those lines were from a series of calls to ps -fu www | grep environment.pl