ecarceller has asked for the wisdom of the Perl Monks concerning the following question:

I want to monitor a DB server for up or down status but need something more than "ping". The idea is to remotely run a query and analyze the result. If the DB server hangs, the monitoring script must be able to time out and alert about the problem. This is what I am trying, but my simulated "hang" goes unnoticed by the monitoring script, which hangs as well. Looking for suggestions. FYI: this is my very first experience with IPC.

#!/usr/bin/perl
use strict;
use warnings;
use IPC::Open2;
use IO::Select;   # for select

#my $cmd = "sqlplus -S apps/******";                # This is the query I really want
                                                    # to run
my $cmd = "ssh oraprod\@remote_server './x 100'";   # "x" executes
                                                    # sleep $1 to
                                                    # simulate a process
                                                    # hanging on the
                                                    # remote server
my $timeout=1;
my $infh;
my $outfh;
my @output;
my $pid;

eval{
    $pid = open2($outfh, $infh, $cmd);
};
die "open2: $@\n" if $@;

print "PID was $pid\n"; #-------------------------- DEBUG ------------
#print "Timeout was $timeout\n";

my $sel = new IO::Select;   # create a select object to notify
                            # us on write ready.
$sel->add($infh);           # add the std input file handle

if ($sel->can_write($timeout)){
    # I will base my "DB alive" test off of this query's output
    #print $infh "set feedback off pages 0\nselect sysdate from dual;\n";
}
else {
    die "Timed out while waiting for the \"Write OK\" to send the query\n";
}

print "PID was $pid\n"; #-------------------------- DEBUG ------------

$sel->add($outfh);          # add the std output file handle

while ($sel->can_read($timeout)) {   # Returns the std output file
                                     # handle even while the sleep is
                                     # being x.
                                     # Would it do the same if "x" were
                                     # a truly hung process?
                                     # My hope is it won't so the
                                     # while's condition would be false
                                     # after timeout causing the script
                                     # to skip the while.
    chomp(my $line=<$outfh>);   # There is nothing on "x" std output to
                                # be read so $line does not get defined.
    push(@output,$line);
}

close $infh;

print "PID was $pid\n"; #-------------------------- DEBUG ------------

waitpid $pid,0;

$sel->remove($infh,$outfh); # remove std output f handle from the list

unless (@output) {
    die "Timed out while waiting for query output\n";
}

print "$_\n" for @output;

Re: heartbeat script. Hung-server proof.
by JavaFan (Canon) on Nov 04, 2008 at 21:22 UTC
    The classic approach is to set a handler for SIGALRM that throws an exception (sub {die "Timeout"}), set the alarm for X seconds, and then, inside an eval, do whatever you want to be able to time out on. The last statement of the eval is typically 'alarm(0)', which cancels the pending alarm.

    Check whether the eval succeeded. If not, check $@. If it matches "Timeout", you timed out. Otherwise, some other exception was thrown.

    perldoc -f alarm will show an example.
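
For illustration, here is a minimal sketch of the alarm/eval pattern described in this reply. The helper name run_with_timeout and its return convention are hypothetical, not from the thread or from any module. The code reference stands in for the remote query, and the hang is simulated with a four-argument select rather than the real ssh or sqlplus command.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): run $code, giving up after
# $timeout seconds. Returns (1, @results) on success, (0, $error) otherwise.
sub run_with_timeout {
    my ($timeout, $code) = @_;
    my @result;
    my $ok = eval {
        # The trailing "\n" stops Perl from appending "at ... line N",
        # so the message can be compared exactly later on.
        local $SIG{ALRM} = sub { die "Timeout\n" };
        alarm($timeout);        # schedule SIGALRM
        @result = $code->();    # the operation that might hang
        alarm(0);               # finished in time: cancel the alarm
        1;
    };
    my $err = $@;
    alarm(0);                   # cancel again in case $code died on its own
    return (1, @result) if $ok;
    return (0, $err);
}

# Simulated hang: block for 3 seconds, but allow only 1.
my ($ok, @out) = run_with_timeout(1, sub { select(undef, undef, undef, 3); "late" });
if ($ok) {
    print "alive: @out\n";
}
elsif ($out[0] eq "Timeout\n") {
    print "timed out\n";
}
else {
    print "other error: $out[0]";
}

# Fast operation: completes well within the limit.
my ($ok2, @out2) = run_with_timeout(2, sub { "ok" });
print "alive: @out2\n" if $ok2;
```

Comparing $@ against the exact string separates a timeout from any other exception, as the reply suggests. If the timed-out block started a child process (for example through open2), that child is probably still running and would have to be killed and reaped separately.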

      Right on the money. Thanks a lot!!!!!!