| [reply] |
That problem already exists. If the script is dying a lot,
it doesn't matter what respawn mechanism is used. There will
still be a significant load placed on the system.
The problem description, though, makes it sound like it only
happens occasionally, which has made it hard to debug. In
that case, I would rather let a proven and well-known
mechanism like init do the monitoring for me than have to
debug both the respawn code and the code that is dying in the
first place.
Second, init is usually pretty smart and stops retrying
a job if it is respawning too quickly. So your load increases
for a bit, but init does the Right Thing and stops it from
becoming a fork bomb.
mikfire
| [reply] |
Yes, you are right about init stopping if the process
respawns too quickly. I forgot that, it's true.
You can stop the script manually instead of waiting
for init to decide, but that's no big deal. More
important is that a well chosen sleep time between
restarts will keep your system responsive anyway.
That would also be a crude workaround if the cause of
the program's death is temporary (say, a missing
resource like an NFS share) and the program just
"die"s instead of doing a wait-retry cycle itself.
Finally, logging and/or an alert mechanism can
easily be implemented.
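
Roughly what I have in mind, as a minimal sketch in Python
(the language of the original script is not given in this
thread, and the name supervise, the 30-second delay and the
restart cap are all made up for illustration):

```python
import logging
import subprocess
import sys
import time

log = logging.getLogger("respawn")


def supervise(cmd, delay=30.0, max_restarts=5, sleep=time.sleep):
    """Run cmd and restart it after every non-zero exit.

    Pauses `delay` seconds between restarts so a crash loop cannot
    hog the machine, and gives up after `max_restarts` restarts.
    Returns the number of restarts that were performed.
    """
    restarts = 0
    while True:
        status = subprocess.call(cmd)
        if status == 0:
            log.info("%r exited cleanly", cmd)
            return restarts
        # This is the place to hook in mail or pager alerts.
        log.warning("%r died with status %d", cmd, status)
        if restarts >= max_restarts:
            log.error("giving up on %r after %d restarts", cmd, restarts)
            return restarts
        restarts += 1
        sleep(delay)
```

The sleep function is passed in as a parameter only so the
loop can be tried out without really waiting.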
To prevent a misunderstanding: one could (and maybe
should) put that code into the original program,
so the admin can leave the watch-respawn work to
init.
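
Inside the program itself, the wait-retry cycle could look
something like this. Again only a sketch in Python under
my own assumptions: wait_for_path is a hypothetical helper,
and checking that a path exists stands in for whatever
resource the real program is missing.

```python
import os
import time


def wait_for_path(path, attempts=10, delay=30.0, sleep=time.sleep):
    """Poll until `path` exists, instead of dying on the first miss.

    Checks up to `attempts` times, sleeping `delay` seconds between
    checks. Returns True as soon as the path is there, False if it
    never shows up.
    """
    for attempt in range(1, attempts + 1):
        if os.path.exists(path):
            return True
        if attempt < attempts:
            # No point sleeping after the final failed check.
            sleep(delay)
    return False
```

If this returns False the program can still exit with an
error, and init takes it from there.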
Andreas
| [reply] |
Most versions of init that I've run into
will notice that an entry has restarted over and over in
a short time and just complain to the console that entry
"xyz" is restarting too much and it won't restart it anymore
until you change the inittab.
You see this a lot with flaky terminals (people still
know what a terminal is, don't they?) where getty keeps croaking and init just gives up
on it.
I think init is the perfect solution for this
problem if you have access to it.