jimbass has asked for the wisdom of the Perl Monks concerning the following question:

Hello, all. I have a short Perl script that is a Nagios plugin. It checks the RF speed of a radio. It works well as long as the target IP exists. Today I found that when the radio I'm querying is hard down, the snmpwalk command times out, but my (admittedly very poor) code doesn't catch the timeout.
#!/usr/bin/perl -w
use strict;

my $radio    = $ARGV[0];
my $warnrate = $ARGV[1];
my $critrate = $ARGV[2];

if ($critrate <= $warnrate) {
    open DATA, "/usr/bin/snmpwalk -v1 -c public $radio .1.3.6.1.4.1.5454.1.40.2.4.0|"
        or die "Failed: $!\n";
    while ( defined( my $line = <DATA> ) ) {
        my @values = split(' ', $line);
        my $score  = $values[3];
        close DATA;
        if ($score > $warnrate) {
            print "OK, rf is $score. \|Mbps=$score\n";
            exit 0
        }
        elsif (($score <= $warnrate) && ($score > $critrate)) {
            print "WARNING, rf is $score. \|Mbps=$score\n";
            exit 1
        }
        elsif ($score <= $critrate) {
            print "CRITICAL, rf is $score. \|Mbps=$score\n";
            exit 2
        }
        else {
            print "UNKNOWN, rf is $score and something is wrong.\n";
            exit 3
        }
    }
}
else {
    print "Make sure your critical value is less than or equal to your warning value.\n";
    exit 5
}
I would think that since the snmpwalk on a radio that doesn't exist times out, the $score variable should be either null or 0, either of which should trigger the unknown or critical condition, or else the whole script should exit with exit code 5. As I found today, with the radio unreachable and the snmpwalk timing out, this script still exits with an exit code of 0, indicating all is well.

I'd appreciate any insight folks can provide into what I've done wrong here.
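
For reference, a minimal sketch of one way the script above can end with exit code 0. It assumes that a timed-out snmpwalk prints nothing on standard output (the timeout message presumably goes to standard error). In that case the while loop body never runs, none of the exit calls is reached, and Perl falls off the end of the script with status 0. The helper name status_from_output and the sample output line are hypothetical; an in-memory filehandle stands in for the snmpwalk pipe.

```perl
use strict;
use warnings;

# Hypothetical helper: mimics the loop in the original script, but reads
# from a string instead of an snmpwalk pipe. Returns the Nagios status
# the loop would choose, or undef if the loop body never ran.
sub status_from_output {
    my ($output, $warnrate, $critrate) = @_;

    # In-memory filehandle standing in for the pipe from snmpwalk.
    open my $fh, '<', \$output or die "Cannot open in-memory handle: $!";

    while ( defined( my $line = <$fh> ) ) {
        my $score = ( split ' ', $line )[3];
        return 0 if $score > $warnrate;    # OK
        return 1 if $score > $critrate;    # WARNING
        return 2;                          # CRITICAL
    }

    # Reached only when no line was read at all: none of the tests above
    # was ever evaluated, which is the case the original script misses.
    return;
}

# Assumed sample line; real snmpwalk output may be formatted differently.
my $sample = "SNMPv2-SMI::enterprises.5454.1.40.2.4.0 = INTEGER: 54\n";
my $status = status_from_output('', 20, 10);
print defined $status ? "status $status\n" : "loop body never ran\n";
```

Under that assumption, the empty case has to be handled explicitly after the loop (or before it), for example by printing an UNKNOWN message and exiting with code 3.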

Replies are listed 'Best First'.
Re: What did I miss in my test condition?
by hbm (Hermit) on Aug 10, 2012 at 23:15 UTC

    I believe your if-else is flawed:

    if ($score > $warnrate) {
        print "OK, rf is $score. \|Mbps=$score\n";
        exit 0
    # $score is necessarily <= $warnrate at this point
    #} elsif (($score <= $warnrate) && ($score > $critrate)) {
    } elsif ($score > $critrate) {
        print "WARNING, rf is $score. \|Mbps=$score\n";
        exit 1
    # $score is necessarily <= $critrate at this point
    #} elsif ($score <= $critrate) {
    } else {
        print "CRITICAL, rf is $score. \|Mbps=$score\n";
        exit 2
    # should never reach this point
    #} else {
    #    print "UNKNOWN, rf is $score and something is wrong.\n";
    #    exit 3
    }

    Also, your open DATA; while(<DATA>); close DATA seems odd. I haven't used snmpwalk before, and just tried without success; but I wonder if you can't just:

    my $line = `/usr/bin/snmpwalk -v1 -c public $radio .1.3.6.1.4.1.5454.1.40.2.4.0`;
    my @values = split(' ', $line);
    my $score = $values[3];
    if ...
      My hope with the test conditions was that if $score was null or undefined (say, from a timeout), it would fail the numeric tests and trigger the UNKNOWN branch. Obviously that isn't the case. In the general case $score should be a number and produce an exit code of 0, 1, or 2, but I don't test to be certain that $score is numeric. Maybe the preferred approach is: if $score is undefined, immediately exit with exit code 3. That seems a good way to handle it; thanks for your insight! If you're ever in the NYC area, I owe you a beer! This code exits as I had hoped:
      #!/usr/bin/perl -w
      use strict;

      my $radio    = $ARGV[0];
      my $warnrate = $ARGV[1];
      my $critrate = $ARGV[2];

      my $line = `/usr/bin/snmpwalk -v1 -c PTsnmp $radio .1.3.6.1.4.1.5454.1.40.2.4.0`;

      if ($line eq '') {
          print "snmpwalk returned nothing, timeout likely occurred.\n";
          exit 3
      }

      if ($critrate > $warnrate) {
          print "Make sure your critical value is less than or equal to your warning value.\n";
          exit 5
      }

      if ($critrate <= $warnrate) {
          my @values = split(' ', $line);
          my $score  = $values[3];
          if ($score > $warnrate) {
              print "OK, rf is $score. \|Mbps=$score\n";
              exit 0
          }
          elsif (($score <= $warnrate) && ($score > $critrate)) {
              print "WARNING, rf is $score. \|Mbps=$score\n";
              exit 1
          }
          elsif ($score <= $critrate) {
              print "CRITICAL, rf is $score. \|Mbps=$score\n";
              exit 2
          }
      }
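
The idea mentioned above, checking that $score is actually numeric before comparing it, could be sketched as follows. This is only an illustration: extract_score is a hypothetical helper, and it assumes the value of interest is the fourth whitespace-separated field, as in the scripts in this thread. looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: return the fourth whitespace-separated field of
# $line if it is numeric, otherwise return undef.
sub extract_score {
    my ($line) = @_;
    return unless defined $line;

    # A list slice on a short or empty list simply yields undef here.
    my $score = ( split ' ', $line )[3];

    return unless defined $score && looks_like_number($score);
    return $score;
}

# Example: an empty string (as from a timed-out query) gives no score.
my $score = extract_score('');
print defined $score ? "score is $score\n" : "no usable score\n";
```

A caller could then print an UNKNOWN message and exit with code 3 whenever extract_score returns undef, before any of the threshold comparisons run.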

        I don't think $line will ever be empty - you probably ought to chomp it, or test for non-whitespace.

        And you still have unnecessary tests in your if-elsif. For example, if $score is NOT greater-than $warnrate, it IS NECESSARILY less-than-or-equal-to $warnrate.

        Consider:

        use strict;

        my $radio    = $ARGV[0];
        my $warnrate = $ARGV[1];
        my $critrate = $ARGV[2];

        chomp(my $line = `/usr/bin/snmpwalk -v1 -c PTsnmp $radio .1.3.6.1.4.1.5454.1.40.2.4.0`);

        if ($line !~ /\S/) {
            print "snmpwalk returned nothing, timeout likely occurred.\n";
            exit 3;
        } elsif ($critrate > $warnrate) {
            print "Make sure your critical value is less than or equal to your warning value.\n";
            exit 5;
        } else {
            my $score = (split(' ', $line))[3];
            my ($msg,$status) = $score > $warnrate ? ('OK',0)
                              : ( $score > $critrate ? ('WARNING',1)
                                                     : ('CRITICAL',2));
            print "$msg, rf is $score. |Mbps=$score\n";
            exit $status;
        }
Re: What did I miss in my test condition?
by DrHyde (Prior) on Aug 14, 2012 at 10:55 UTC

    Try pinging the radio to see if it's alive before making the SNMP query?

    $ ping -c 3 -W 1 $dead_host >/dev/null 2>&1
    $ echo $?
    1
    $ ping -c 3 -W 1 $live_host >/dev/null 2>&1
    $ echo $?
    0
    Converting this to perl should be trivial.
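
One possible Perl version of the ping check above, as a rough sketch rather than a tested plugin. It assumes a ping on the PATH that accepts the same -c and -W options used in the shell example. host_alive is a hypothetical name, and the hostname pattern is a deliberately conservative assumption, added because the value is interpolated into a shell command.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if ping exits with status 0, else 0.
sub host_alive {
    my ($host) = @_;

    # Refuse anything that does not look like a plain hostname or IP
    # address, since $host ends up inside a shell command line.
    return 0
        unless defined $host
        && $host =~ /\A[A-Za-z0-9][A-Za-z0-9.:-]*\z/;

    # Same invocation as the shell example: 3 probes, 1 second wait,
    # all output discarded. system() returns 0 when ping succeeded.
    my $rc = system("ping -c 3 -W 1 $host >/dev/null 2>&1");
    return $rc == 0 ? 1 : 0;
}
```

In the plugin this might be called before the snmpwalk, printing an UNKNOWN or CRITICAL message and exiting early when host_alive($radio) is false.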