Beefy Boxes and Bandwidth Generously Provided by pair Networks
Your skill will accomplish
what the force of many cannot
 
PerlMonks  

Collecting ping times

by EyeOpener (Scribe)
on Aug 18, 2003 at 16:41 UTC ( [id://284635]=perlquestion: print w/replies, xml ) Need Help??

EyeOpener has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a script to log ping latencies for a bunch of hosts on our network. It works well for the most part, but there's a bug that rears it's head sporadically, and I haven't been able to track it down.

Here's the meat of the code (some vars are defined earlier in the script):

my $p = Net::Ping->new("icmp",$timeout,32); while (1) { foreach my $host (@hosts) { my @times; for (1..4) { my $t0 = [gettimeofday]; my $alive = $p->ping($host); my $t1 = tv_interval ($t0, [gettimeofday]); push @times, $alive ? # was ping successful? int($t1 * 1000) : # if so, record interval $timeout * 1000; # if not, record timeout val } my ($sec, $min, $hr, $day, $month, $year) = (localtime)[0..5]; my $timestamp = sprintf("%04d-%02d-%02d,%02d:%02d:%02d", $year + 1900, $month + 1, $day, $hr, $min, $sec); print "$timestamp,$host,", join ( ',', @times ), "\n"; } sleep $yawn; }
This produces output that looks like:

2003-08-15,17:02:19,10.10.10.50,62,46,62,46 2003-08-15,17:02:20,10.10.10.60,109,46,62,62 2003-08-15,17:02:20,10.10.10.70,93,62,46,62 2003-08-15,17:02:20,10.10.10.80,109,62,46,62 2003-08-15,17:02:21,10.10.10.90,93,62,62,46 2003-08-15,17:02:23,10.10.10.50,46,62,46,62 2003-08-15,17:02:23,10.10.10.60,46,62,62,62 2003-08-15,17:02:23,10.10.10.70,46,62,46,62 2003-08-15,17:02:23,10.10.10.80,62,46,62,46 2003-08-15,17:02:24,10.10.10.90,62,62,46,62
This is handy for importing into a spreadsheet or database, averaging the ping times, etc.

Now, the problem. If I let this run for several minutes, eventually I get output like this:

2003-08-15,17:11:04,10.10.10.50,46,62,46,46 2003-08-15,17:11:04,10.10.10.60,62,62,46,62 2003-08-15,17:11:05,10.10.10.70,62,62,46,62 2003-08-15,17:11:05,10.10.10.80,46,62,62,46 2003-08-15,17:11:05,10.10.10.90,62,46,62,62 2003-08-15,17:11:07,10.10.10.60,2003-08-15,17:11:08,10.10.10.70,2003-0 +8-15,17:11:08,10.10.10.80,2003-08-15,17:11:08,10.10.10.90,2003-08-15, +17:11:11,10.10.10.50,2003-08-15,17:11:11,10.10.10.60,2003-08-15,17:11 +:11,10.10.10.70,2003-08-15,17:11:11,10.10.10.80,2003-08-15,17:11:12,1 +0.10.10.90,
In case that lost or gained some formatting during posting, the problem is that the code eventually skips one host, then stops printing the @times values and the following newline for all successive iterations. The code continues running and logging, but appends only the date, time, and hostname. This last line grows longer ad infinitum.

I'm only just starting to troubleshoot this, but I need to get it running pretty quickly, so I thought I'd open it up to the Monks as well. Is there a flaw in the code, or is there a network condition that could cause this oddness?

Thanks!

Replies are listed 'Best First'.
Re: Collecting ping times
by CombatSquirrel (Hermit) on Aug 18, 2003 at 17:11 UTC
    I don't see any problem with the code, but my first guess would be that something happened to @times. I would insert a warn "scalar \@times = " . scalar(@times) if (@times != 4); to check on that, but actually I am as puzzled as you are.
    Good luck hunting down the bug!

    Added second thought: I am not really into networking and online stuff, but what about those DOS attacks or something similar? Could it be the remote server trying to intervene when it supposes an attack?
      Thanks, and I think you're right about @times. That must be where something(?) is happening. It's a sporadic problem, so I can't force it, but I'll let you know what I find.
Re: Collecting ping times
by valdez (Monsignor) on Aug 18, 2003 at 17:14 UTC

    I bet it is a problem with your console and not with your Perl program. Please, redirect your output to a file and check those results; I was fooled many times by that strange behaviour of Linux console.
    It simply doesn't make sense that a print statement forgets to print a newline, does it?

    Ciao, Valerio

Re: Collecting ping times
by Cine (Friar) on Aug 18, 2003 at 18:07 UTC
    Smokeping does just what you want...

    T I M T O W T D I
Re: Collecting ping times
by BrowserUk (Patriarch) on Aug 18, 2003 at 17:37 UTC

    I'd lay odds that if you enable warnings, you will find that one or more of the values in @times is undefined. With the result that the join doesn't complete, and the print statement aborts before the final "\n" is printed. Hence the effect.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      Warnings report any detected issue but the code continues on doing what perl was best able to guess what should happen. Neither the join() function nor the print will "abort" on an undef. Perl just promotes undef to ''.
      @f = qw(a b c d e f g h); $f[3] = undef; print join('-', @f), $/; __OUTPUT__ a-b-c--e-f-g-h

      --
      [ e d @ h a l l e y . c c ]

      This was my thinking also, that one or more elements are undefined. However, even if that affects one pass, wouldn't the whole thing be cleared up in the next iteration? That is, shouldn't @times be effectively redefined for the next host? I can't figure out why every successive pass is screwed up in this way.

      Since I can't force the problem, I have to wait for it to occur, which may be minutes or hours. I'd like to sprinkle some substantive t-shooting statements in here, but I can't figure out what will capture useful info. All suggestions welcome!

        Sorry! Amazing how you I can see what you I want to see, when I made the post it all made perfect sense, but I guess (as has been comprehensively brought to my attention:), I would have lost that bet...again.

        I've thrown everything I can think of at this code and I simply cannot make it fail, or produce the output you are seeing.

        Maybe if you posted the whole script someone would spot something. It really boggles the mind as to how the same print statement could print the first half of the output and omit the second. I tried redefining join to return undef, the empty list and several other variations, but the newline was always printed. I tried turning buffering off and on, and futzing with STDOUT in another thread, but I (now, belatedly, once my failing have been pointed out to me) cannot find a way to make this happen.


        Examine what is said, not who speaks.
        "Efficiency is intelligent laziness." -David Dunham
        "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
        If I understand your problem, I can solve it! Of course, the same can be said for you.

Re: Collecting ping times
by EyeOpener (Scribe) on Aug 18, 2003 at 19:20 UTC
    Aargh, I just realized I left out an important line from earlier in the script, in case some of you are wondering where gettimeofday and tv_interval came from:

    use Time::HiRes qw( gettimeofday tv_interval );

    Apologies...

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://284635]
Approved by blue_cowdawg
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (4)
As of 2024-03-29 06:47 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found