I assume it is your script, and it is cron'd? I used to monitor roughly like this:
    if (problem) {
        fixIt;
        logIt;
        if (-f $indicator) {
            pageMe;
            unlink $indicator;
        } else {
            touch $indicator;
        }
    } else {
        unlink $indicator;
    }
Thus, I'd get paged only if a problem persisted. With reliable "fixIt" and detection methods, and an appropriate cron schedule, that served me well.
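For what it's worth, here is a rough Perl sketch of one such cron pass. The name check_once and the problem/fix/log/page callbacks are made up for illustration (they stand in for problem, fixIt, logIt and pageMe above), and the indicator is assumed to be an ordinary flag file:

```perl
use strict;
use warnings;

# Hypothetical helper: one monitoring pass, meant to be run from cron.
# The callbacks are placeholders for your own detection, repair,
# logging and paging code.
sub check_once {
    my (%arg) = @_;
    my $indicator = $arg{indicator};

    if ($arg{problem}->()) {
        $arg{fix}->();
        $arg{log}->();
        if (-f $indicator) {
            # The previous pass saw a problem too: page, then reset.
            $arg{page}->();
            unlink $indicator;
        }
        else {
            # First sighting: leave a marker, do not page yet.
            open my $fh, '>', $indicator
                or die "Cannot create $indicator: $!";
            close $fh;
        }
    }
    else {
        # All clear; unlink is harmless if the file is absent.
        unlink $indicator;
    }
    return;
}
```

Because the indicator is removed after each page, a problem that never clears would page on every second pass, so the cron interval sets how noisy this gets.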
If your condition isn't programmatically fixable, I'd change that around to perhaps this:
    if (problem) {
        logIt;
        if (! -f $indicator) {
            pageMe;
            touch $indicator;
        }
    } else {
        unlink $indicator;
    }
Thus, you'd only get paged initially. Or you might also page yourself when the error is cleared; or check the timestamp of the indicator and page every hour; or whatever.
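The "page every hour" variation could look something like the sketch below. Again, alert_throttled, its callbacks and the interval parameter are invented names, and the 3600-second default is only an assumption taken from the "every hour" idea:

```perl
use strict;
use warnings;

# Hypothetical helper: page on the first failure, then at most once
# per $interval seconds while the problem lasts.
sub alert_throttled {
    my (%arg) = @_;
    my $indicator = $arg{indicator};
    my $interval  = $arg{interval} // 3600;   # assumed default: one hour

    if ($arg{problem}->()) {
        $arg{log}->();
        # mtime of the flag file; undef if the file does not exist.
        my $mtime = (stat($indicator))[9];
        if (!defined $mtime || time - $mtime >= $interval) {
            $arg{page}->();
            # (Re)create the flag and stamp it with the current time,
            # so its mtime records when we last paged.
            open my $fh, '>', $indicator
                or die "Cannot write $indicator: $!";
            close $fh;
            utime undef, undef, $indicator;
        }
    }
    else {
        # Problem cleared: the next failure pages immediately.
        unlink $indicator;
    }
    return;
}
```

The flag file's timestamp is the only state kept between cron runs, so nothing else needs to be stored.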
(I remember a similar bill when first testing such scripts, despite "unlimited" texts on my pager...)
Update: The best way to limit alerts is to limit alert conditions...
pileofrogs, what was the condition that lasted several days? How was it resolved? Could it not have been resolved automatically (with perl)?
In reply to "Re: Module to limit email floods?" by hbm, in thread "Module to limit email floods?" by pileofrogs