I tracked down a bug today in the SNMP support library used by MRTG. I have a fix for it, but I'm not sure whether it's an ugly hack, the Right Thing or if there's a different angle for a better way to do it.

I've been using MRTG for a while now, to watch Cisco routers (even if they do have really buggy snmp support), HP switches and sundry servers from a number of manufacturers. I'm currently setting up a VPN using Netscreen routers and ADSL lines, and I wanted to monitor them. As soon as I plugged them in, I started getting really weird non-reproducible errors. Sometimes the router would respond, sometimes not, and no way to understand why.

After futzing around with snmpwalk and some other support tools in the NetSaint kit, I felt pretty sure that the Netscreen routers were working correctly, it was the MRTG code that was wrong.

Once I started looking there, it was pretty quick to zoom into the offending code. In a nutshell, with SNMP when you want to ask a piece of equipment for some information, you have to tell it what community you belong to (read: authentication), what information you want, and a request id, which I surmise is kind of like a cookie.

By instrumenting the code with print statements, it became apparent that the following conditional was at fault. It would be true or false at random:

if ($response_community ne $this->{community} || ($response_id) != $this->{request_id})) { # ... }

... and that determines success or failure in receiving a response. So I printed out the values of these variables and hit the jackpot:

print "$response_community $this->{community} $response_id $this->{req +uest_id}\n"; # produces snmpfoo snmpfoo 2325544852 -1969422444

Looks like 32-bit signedness rearing its ugly head. If the MSB is zero, the test passed, if one, it failed. I had to get a job done, so I just packed the values and string compared them:

if ($response_community ne $this->{community} || pack('l',$response_id) ne pack('l',$this->{request_id})) { # ... }

So my questions are:

  1. I can't believe that this problem hasn't occurred before, somewhere in the world. Google provided a few similar leads, but nothing exactly the same (which is why I'm taking the time to be as explicit as possible... maybe someone else will put in the right keywords and pull this page out of the infinite depths). Wouldn't a problem of signedness have been found sooner?
  2. If this is a problem, is this the best way to fix it? Or is there a simpler way of coercing signed and unsigned integers to get along? I'm not overly fussed with speed: we are talking about transactions that take several dozen milliseconds if you're lucky, more likely several hundred. A cycle here or there doesn't matter.
  3. Shouldn't this problem be corrected upstream in the SNMP library, so that request ids are always of the same sign?

My personal feeling for that last one is a yes, but I'm not well versed in SNMP to know whether that's reasonable or not. I have traced the problem down to a rather horrendous piece of code in the BER.pm module called "decode_by_template", which is passed a sprintf-like string which is decoded manually. Trying to trace through it makes me want to gouge my eyes out: it's a textbook case of creature feep.

I think I'll just mail my hack to the maintainer, and see what he wants to do about it, but any recommendations or ideas will be much appreciated, before I send it off.


print@_{sort keys %_},$/if%_=split//,'= & *a?b:e\f/h^h!j+n,o@o;r$s-t%t#u'

In reply to Bug in SNMP_Session when polling Netscreen routers with MRTG by grinder

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.