in reply to CGI Out of Control
Apache 1.3.22 has some issues. They are probably unrelated, but you should consider upgrading to 1.3.27.
If you are letting every man and his dog write and run CGI scripts you are asking for trouble. It could be that someone has written a script like this:
    #!/usr/bin/perl
    fork() while 1   # this will fork 'em
which will bring any server to its knees. Perhaps you have someone's CGI acting as a spam gateway that is being used to hammer your server.
It seems like you should be looking towards instituting some decent process/memory/cpu load monitoring with a view to seeing what happens just before your server crashes. I presume you have gone as far as checking the httpd/log files and know about top?
This is as good a place as any to start
Here is a really basic Perl monitoring tool for you at a bargain basement price:

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $logfile   = '/var/log/top';
    my $max_size  = 10**6;   # rotate once a log passes roughly 1MB
    my $max_files = 10;      # number of logs kept in the circular rotation
    my $delay     = 2;       # seconds between samples
    my $count     = 0;
    my $num       = 0;

    while (1) {
        my $time = scalar localtime;
        # rotate logfiles so they don't get too big
        if ( -e "$logfile$num.log" and -s "$logfile$num.log" > $max_size ) {
            $count++;
            $num = $count % $max_files;
            unlink "$logfile$num.log" if -e "$logfile$num.log";
        }
        my $top = `top -b -n1`;   # -b (batch mode) keeps the output plain text
        open LOG, ">>$logfile$num.log" or die "Can't write $logfile$num.log $!\n";
        print LOG $time, "\n", $top, "\n\n";
        close LOG;
        sleep $delay;
    }
Just background this with & and check the logs after a crash. Adjust the logfile size/number and sleep granularity to suit yourself. An average top record will be about 2K, so a 1MB file holds about 500 records and at a 2 second granularity you will fill one roughly every 20 minutes. 10 files lets you monitor the last 3 hours or so. You can munge the data to your heart's content. You will be interested in the last file written pre-crash, which will usually also be the smallest as it will only be partially written. Note that the rotation is circular, with old logs overwritten.
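If you change the settings, the arithmetic above is easy to redo. This is only a rough sketch: retention_seconds is a name I made up for illustration, and the 2K record size is the ballpark figure quoted above, not something measured on your box.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Estimate how many seconds of history the rotating logs hold.
# Sizes are in bytes, the delay is in seconds.
sub retention_seconds {
    my ( $max_size, $max_files, $delay, $record_size ) = @_;
    my $records_per_file = $max_size / $record_size;
    return $records_per_file * $delay * $max_files;
}

# Settings from the monitoring script; 2_000 bytes per record is a guess.
my $seconds = retention_seconds( 10**6, 10, 2, 2_000 );
printf "One file fills in about %.0f minutes\n", $seconds / 10 / 60;
printf "Ten files cover about %.1f hours\n",     $seconds / 3600;
```

With the default settings this comes out a little under 20 minutes per file and a little under 3 hours in total. A busier server has more processes, so bigger records and less history.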
You will end up with a file full of this:
    Mon Feb 10 14:41:55 2003
      2:47pm  up 160 days,  1:22,  3 users,  load average: 0.40, 0.26, 0.19
    28 processes: 25 sleeping, 1 running, 0 zombie, 2 stopped
    CPU0 states:  0.1% user,  1.0% system,  0.0% nice, 97.0% idle
    CPU1 states:  0.0% user,  1.0% system,  0.0% nice, 98.0% idle
    Mem:  1551228K av, 1539476K used,   11752K free,       0K shrd,   79828K buff
    Swap: 1534072K av,  768592K used,  765480K free                  452784K cached

      PID USER     PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM   TIME COMMAND
    26290 root      16   0  1248 1228  1024 S     0.9  0.0   0:00 sshd
     5031 root      20   0  1016 1016   828 R     0.9  0.0   0:00 top
        1 root       8   0   424  388   372 S     0.0  0.0   1:29 init
    29045 root       9   0   356  304   288 S     0.0  0.0   0:22 syslogd
    29134 root       9   0   376  272   236 S     0.0  0.0   0:00 sshd
    29160 root       9   0   468  356   288 S     0.0  0.0   0:05 xinetd
    29188 root       9   0   220   56    56 S     0.0  0.0   0:00 safe_mysqld
    29234 mysql     13   5  3908 1484  1336 S N   0.0  0.0   0:00 mysqld
    29244 mysql     13   5  3908 1484  1336 S N   0.0  0.0   0:52 mysqld
    29245 mysql     13   5  3908 1484  1336 S N   0.0  0.0   0:00 mysqld
    29246 root       9   0   488  312   296 S     0.0  0.0   2:17 httpd
    29247 mysql     13   5  3908 1484  1336 S N   0.0  0.0   0:00 mysqld
    29277 root       9   0   168  128    88 S     0.0  0.0   0:07 crond
    28116 apache     9   0  1092 1040   876 S     0.0  0.0   0:00 httpd
    28117 apache     9   0  1084 1024   872 S     0.0  0.0   0:00 httpd
    28118 apache     9   0  1092 1040   888 S     0.0  0.0   0:00 httpd
    [snip]
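As a starting point for munging, here is a rough sketch that pulls the timestamp and one-minute load average out of each record so you can see where things took off. It assumes every record starts with a scalar localtime line followed by top's "load average:" header, as in the sample above. The load_history name and the threshold of 5 are my own inventions, and the second sample record is made up purely to show a spike.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return a list of [timestamp, one-minute load] pairs found in the log text.
sub load_history {
    my ($text) = @_;
    my ( @history, $stamp );
    for my $line ( split /\n/, $text ) {
        # A "scalar localtime" line, e.g. "Mon Feb 10 14:41:55 2003"
        if ( $line =~ /^\s*([A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d \d{4})\s*$/ ) {
            $stamp = $1;
        }
        # The first load average line after a timestamp belongs to that record
        elsif ( defined $stamp and $line =~ /load average:\s*([\d.]+)/ ) {
            push @history, [ $stamp, $1 ];
            undef $stamp;
        }
    }
    return @history;
}

# Hypothetical threshold; pick one that suits your box.
my $threshold = 5;

# Sample data: the first record is trimmed from the log above,
# the second is invented for illustration.
my $sample = <<'END';
Mon Feb 10 14:41:55 2003
  2:47pm  up 160 days,  1:22,  3 users,  load average: 0.40, 0.26, 0.19
28 processes: 25 sleeping, 1 running, 0 zombie, 2 stopped

Mon Feb 10 14:41:57 2003
  2:47pm  up 160 days,  1:22,  3 users,  load average: 7.35, 2.10, 0.80
31 processes: 20 sleeping, 11 running, 0 zombie, 0 stopped
END

for my $rec ( load_history($sample) ) {
    my ( $stamp, $load ) = @$rec;
    print "$stamp  load $load", ( $load > $threshold ? "  <-- high" : "" ), "\n";
}
```

To run it against the real thing, slurp one of the log files into a string and pass that to load_history instead of $sample. Once you know when the load spiked you can go back to the full record for that timestamp and see which processes were eating the CPU and memory.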
cheers
tachyon
s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print