bennymack:

While sgifford offers a good solution, I thought I'd offer up one I use periodically. I often have a "watchdog" process running whose only job is to send EMails when it detects that something has gone awry. This has the added benefit of catching (some) scripts that hang in infinite loops.

My usual method is to have the simple watchdog just watch a directory. If it ever notices a file over X seconds old, it EMails it to you. So any script that wants to take advantage of it may simply put a formatted message in the watched directory, and "touch" it periodically to ensure that it's not too old. If the script dies, the watchdog sends the message to you when the file ages out. If the script hangs without touching the file, the watchdog will still send you the EMail. But you're still out of luck if the script hangs in a loop that keeps touching the file.

I don't have the exact code in front of me at the moment, but it's something like this:

    #!/usr/bin/perl -w
    use warnings;
    use strict;

    ###############
    # CONFIG VARS #
    ###############
    my $dir_to_watch = "/cygdrive/c/WATCHDOG";
    my $timeout = 120;   # Max file age (sec)
    my $awhile = 30;     # Checking interval (sec)

    #--------------------------------------------------
    # Sends contents of specified file to admin
    sub send_msg {
        my $fn = shift;
        print "You'd send $fn via EMail here...\n";
    }

    chdir $dir_to_watch;
    while (1) {
        for (`ls`) {
            chomp;
            my $age = time - (stat)[9];
            send_msg($_) if $age > $timeout;
        }
        sleep $awhile;
    }

(The code above is tested, except it doesn't send EMail. Plug appropriate code into send_msg.)
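
A minimal sketch of what send_msg could look like, not the author's actual code. It assumes the watched file already starts with To: and Subject: headers (as in the client script in this post) and that a sendmail-compatible binary exists at /usr/sbin/sendmail; the @MAILER variable and the choice to delete the file after sending are assumptions added here so the same alert is not resent on every pass.

```perl
use strict;
use warnings;

# Assumption: a sendmail-compatible mailer lives here; adjust for your system.
# With -t, sendmail takes the recipients from the To: header in the message.
our @MAILER = ('/usr/sbin/sendmail', '-t');

# Pipe the file's contents to the mailer, then remove the file so the
# watchdog does not mail it again on the next pass. Returns 1 on success.
sub send_msg {
    my $fn = shift;

    open my $in, '<', $fn
        or do { warn "Can't read $fn: $!\n"; return 0 };
    local $/;                       # slurp the whole file in one read
    my $text = <$in>;
    close $in;
    $text = '' unless defined $text;

    open my $mail, '|-', @MAILER
        or do { warn "Can't run mailer: $!\n"; return 0 };
    print {$mail} $text;
    close $mail
        or do { warn "Mailer failed for $fn (status $?)\n"; return 0 };

    unlink $fn or warn "Can't remove $fn: $!\n";
    return 1;
}
```

If you would rather keep the stale file around for inspection, rename it out of the watched directory instead of unlinking it.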

Then my programs (C++, etc.) and scripts use it by writing an appropriate message to deliver on failure and updating it periodically, like:

    #!/usr/bin/perl -w
    use warnings;
    use strict;

    ###############
    # CONFIG VARS #
    ###############
    my $awhile = 45;   # Checking interval (sec)
    my $my_watched_file = "/cygdrive/c/WATCHDOG/foobar";
    my $EMailText= 'To: roboticus@a.fake.domain.com
    Subject: JobToMonitor.pl fault!

    Yecch!
    ';

    #--------------------------------------------------
    # Let watchdog know we're alive...
    sub still_alive {
        open OF, '>>', $my_watched_file;
        print OF $EMailText;
        close OF;
    }

    unlink $my_watched_file;
    my $next_time = 0;
    my $count = 0;
    while ($count < 999999999999) {
        # put this in a part that's likely to be OK
        if (time > $next_time) {
            &still_alive;
            $next_time = time + $awhile;
        }

        # smallish chunks of job that shouldn't take
        # too much time
        ++$count;

        # You can even use it for periodic logging!
        $EMailText = "$count reached at " . (localtime) . "\n"
            if $count % 1000000 == 0;
    }

    # Don't alert if we complete successfully
    unlink $my_watched_file;

Yeah, it's admittedly contrived, but it's a handy thing for servers running lots of odd jobs.
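
The client above appends the message on every heartbeat, so the watched file grows over a long run. A minimal sketch of a variant that writes the message once and afterwards only refreshes the timestamp, closer to the "touch" described earlier. The helper names write_message and touch_file are hypothetical, not part of the original script.

```perl
use strict;
use warnings;

# Hypothetical helper: (re)write the message the watchdog would mail.
# Opening with '>' truncates, so the file always holds exactly one message.
sub write_message {
    my ($file, $text) = @_;
    open my $fh, '>', $file or die "Can't write $file: $!";
    print {$fh} $text;
    close $fh or die "Can't close $file: $!";
}

# Hypothetical helper: set the file's access and modification times to now
# without changing its contents, like the shell's touch on an existing file.
sub touch_file {
    my ($file) = @_;
    my $now = time;
    utime($now, $now, $file) or die "Can't touch $file: $!";
}
```

In the job loop you would call write_message once at startup (and again whenever the text changes), then call touch_file wherever still_alive is called now.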

--roboticus


In reply to Re: Have script send email if it dies by roboticus
in thread Have script send email if it dies by bennymack