Hi

I start all my processes via crontab using init scripts. One day, while investigating process monitoring, I had an epiphany: why bother adding another set of monitoring rules when the crontab already describes perfectly when a process should and shouldn't be running?

I came up with the script below, which almost seems too simple for such a great benefit:

#!/usr/bin/perl -w
use strict;
use File::Basename;
use POSIX 'strftime';
use Sys::Syslog;

sub logit($);

logit('starting up');

my $cron_cmd = 'crontab -l';

# Not declared in the code as posted; required under strict, and it has
# to live outside the loop because the status is tracked between passes.
my %schedule;

while ( 1 ) {
    my $now   = strftime('%H%M', localtime);
    my $today = strftime('%w', localtime);

    # Build the schedule from the crontab entries that apply today
    for ( qx/$cron_cmd/ ) {
        if ( /^([\d*]+)\s+([\d*]+)\s+([\d*]+)\s+([\d*]+)\s+([\d\-*]+)\s+(.*)/ ) {
            my ( $min, $hour, $day, $mon, $dow, $cmd ) = ( $1, $2, $3, $4, $5, $6 );

            # Turn the day-of-week field into a regex to test against today
            my $pattern;
            if ( $dow =~ m/\-/ ) {
                $pattern = "[$dow]";
            }
            elsif ( $dow =~ m/,/ ) {
                $dow =~ s/,//;
                $pattern = "[$dow]";
            }
            elsif ( $dow =~ m/^\d$/ ) {
                $pattern = "^$dow";
            }
            else {
                $pattern = ".*";
            }
            next unless $today =~ qr/$pattern/;

            if ( $cmd =~ m#(/etc/init.d/[^\s]+)\s+(start|stop)# ) {
                my ( $init, $mode ) = ( $1, $2 );
                if ( $mode eq 'start' ) {
                    $schedule{$init}{start} = "$hour$min";
                }
                else {
                    $schedule{$init}{stop} = "$hour$min";
                }
            }
        }
    }

    # Check every scheduled init script that should be running right now
    for ( keys %schedule ) {
        if ( $now > $schedule{$_}{start} && $now < $schedule{$_}{stop} ) {
            if ( system("$_ status >/dev/null") ) {
                if ( $schedule{$_}{status} ne 'FAIL' || !$schedule{$_}{status} ) {
                    logit("$_ isnt running");
                    $schedule{$_}{status} = 'FAIL';
                }
            }
            else {
                if ( defined $schedule{$_}{status} && $schedule{$_}{status} eq 'FAIL' ) {
                    logit("$_ is back to normal");
                }
                $schedule{$_}{status} = 'OK';
            }
        }
    }
    sleep 5;
}

logit('shutting down');

sub logit($) {
    my $msg = shift;
    openlog(basename($0), '', 'LOG_DAEMON');
    syslog('info', $msg);
    closelog();
}

This mostly works the way I expect, but it raises false alerts for processes that are not scheduled to run at all. For instance, processes that should only run on Sunday are being flagged as not running today (Monday), when they shouldn't be running anyway.

The entries below are triggering alerts today even though date +%w returns 1:

00 05 * * 0 /etc/init.d/foo start
00 22 * * 0 /etc/init.d/foo stop

The crontab entries below have caused no alerts so far today:

00 05 * * 1-5 /etc/init.d/foo2 start
00 22 * * 1-5 /etc/init.d/foo2 stop
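
One way to narrow this down is to test the day-of-week check on its own, outside the monitoring loop. The following is only a minimal sketch: dow_matches is a hypothetical helper that is not part of the script above, it compares numbers rather than building a regex, and it assumes the field contains only "*", single digits, ranges such as 1-5, or comma lists (it does not treat 7 as Sunday).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): returns 1 if a
# crontab day-of-week field covers the given day number, where
# 0 = Sunday as printed by `date +%w`, and 0 otherwise.
sub dow_matches {
    my ( $field, $today ) = @_;
    return 1 if $field eq '*';

    # A field may be a comma list whose parts are digits or ranges
    for my $part ( split /,/, $field ) {
        if ( $part =~ /^(\d)-(\d)$/ ) {
            return 1 if $today >= $1 && $today <= $2;
        }
        elsif ( $part =~ /^\d$/ ) {
            return 1 if $today == $part;
        }
    }
    return 0;
}

# Try the fields from the crontab entries above against Monday (day 1)
for my $field ( '0', '1-5', '*', '1,3,5' ) {
    printf "%-6s on day 1: %s\n", $field,
        dow_matches( $field, 1 ) ? 'match' : 'no match';
}
```

With this helper, a field of 0 does not match day 1 while 1-5 does. If the field test behaves as expected in isolation, the false alerts probably originate somewhere else in the loop.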

I can't spot the issue myself, so I'm appealing for help. Any other suggestions for improvement are also very much appreciated.

Thank you


In reply to Process Monitoring Madness by woland
