in reply to Ignoring/Trapping the DIE signal

The most reliable way to do this is to write a small, very simple wrapper program that runs your Perl script and restarts it if it fails. That handles nearly all of the failure modes, including Perl itself crashing. The wrapper program could itself die, so you might want to put it in /etc/inittab, so that init will automatically restart it. init could still crash, and then your OS will go down; you should be able to set up your kernel and/or BIOS and/or watchdog timers to reboot automatically in that event. Of course, a meteor could crash into the building where your server is located, but then your Perl script is probably the least of your concerns. :-)
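A minimal sketch of such a wrapper, written in Perl for illustration, might look like this. The name `supervise`, the restart budget, and the delay between restarts are assumptions for the example, not anything prescribed above; a real wrapper might well restart forever.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Run a command repeatedly until it exits cleanly or the run budget is
# used up. Returns the number of times the command was started.
# (supervise, $max_runs and $delay are illustrative names.)
sub supervise {
    my ($cmd, $max_runs, $delay) = @_;
    my $runs = 0;
    while ($runs < $max_runs) {
        $runs++;
        my $status = system(@$cmd);
        # 0 means a clean exit; anything else (non-zero exit code, death
        # by signal, failure to start) is treated as a failure.
        last if $status == 0;
        warn "child failed (wait status $status), run $runs of $max_runs\n";
        sleep $delay if $delay && $runs < $max_runs;
    }
    return $runs;
}

# When given a command line, supervise it: up to 1000 runs, 5 s apart.
supervise([@ARGV], 1000, 5) if @ARGV;
```

Saved under a hypothetical name such as wrapper.pl, it would be started as `perl wrapper.pl /path/to/your/script.pl`. It deliberately does nothing clever, since the point of the wrapper is to have as few ways to fail as possible.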

daemontools is a very convenient and reliable wrapper program to use for automatically restarting your script; it automatically installs itself into /etc/inittab, and so solves a great deal of the problem on its own.

Re^2: Ignoring/Trapping the DIE signal
by chrism01 (Friar) on Jun 15, 2006 at 02:19 UTC
    Actually, I've already got that functionality in cron & startup scripts for fatal errors, but it doesn't help if MySQL has gone away. For one of the progs, that's fine, but the main one is a RADIUS (Accounting) packet counter/sender, with a lot of ancillary info for each record stored in memory. If it crashes we'd lose that info, and keeping it is the main purpose of the machine.
    This prog can manage without MySQL for a while if needed, but I'd like it to try to reconnect regularly until MySQL comes back.
    I'd be surprised if there isn't a way to do this...
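For the "try to reconnect regularly" part, one possible shape is a small retry helper. This is only a sketch: `connect_with_retry`, the attempt count, and the delay are made-up names and values, and the helper takes a code ref so it does not assume anything about how the actual connection is made.

```perl
use strict;
use warnings;

# Call $connect (a code ref that returns a handle, or dies / returns
# undef on failure) until it succeeds or $max_tries attempts are used.
# Returns the handle, or undef if every attempt failed.
sub connect_with_retry {
    my ($connect, $max_tries, $delay) = @_;
    for my $try (1 .. $max_tries) {
        my $handle = eval { $connect->() };
        return $handle if defined $handle;
        my $err = $@ || 'connect returned undef';
        chomp $err;
        warn "connect attempt $try failed: $err\n";
        sleep $delay if $delay && $try < $max_tries;
    }
    return;
}

# Hypothetical usage with DBI ($dsn, $user and $pass are placeholders):
#   my $dbh = connect_with_retry(
#       sub { DBI->connect($dsn, $user, $pass,
#                          { RaiseError => 1, PrintError => 0 }) },
#       10, 30);
```

The caller can carry on with its in-memory data whenever the helper returns undef and simply call it again on the next pass.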
      Hrm. You could certainly store the data in stable storage, like with DB_File, though that would have a performance cost. On the other hand, I've never seen a die that eval didn't deal with properly...
        Yeah, the performance cost is why I only update the DB after a successful send, and then only with basic info, not all the different packet type counts.
        As for the eval{}, feel free to try it on the above code. What seems to happen (as per the results above) is that it can't handle repeated failures, i.e. if MySQL doesn't come back quickly, it survives one or two DB-related errors but finally keels over, even if used with $SIG{__DIE__}.
        If you can prove me wrong I'd be ecstatic :-) .
        I've also tried putting the eval around just the SQL command block, i.e. from SELECT ... to ... finish, without success.
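One thing that may be worth ruling out when eval and $SIG{__DIE__} are combined: a global __DIE__ hook is called even for a die inside an eval, so a hook that exits will end the program regardless of the eval. The sketch below is not a diagnosis of the code discussed here, just an illustration of an eval wrapper that switches the hook off locally. The name `guarded` and its return convention are assumptions made for the example.

```perl
use strict;
use warnings;

# Run $op (a code ref) so that a die inside it cannot end the program.
# Returns (1, result) on success or (0, error message) on failure.
sub guarded {
    my ($op) = @_;
    my $result;
    my $ok = eval {
        # Disable any global __DIE__ hook for the duration of the call,
        # so only the eval decides what happens to the exception.
        local $SIG{__DIE__};
        $result = $op->();
        1;
    };
    return (1, $result) if $ok;
    my $err = $@ || 'unknown error';
    return (0, $err);
}

# Hypothetical usage around a block of DBI calls:
#   my ($ok, $err) = guarded(sub { $sth->execute(@args); $sth->finish; 1 });
#   warn "DB step failed, will retry later: $err" unless $ok;
```

Each call stands alone, so any number of consecutive failures gets the same treatment as the first one.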