Greetings, esteemed monks!

I have been developing a system that polls a series of machines every minute. All of the hosts run Red Hat Enterprise Linux 4, and the less time the polling takes, the better. The polling process involves looping over an array and pinging each host. I don't care about the output from ping, just the return value. I was checking via

if ( !( system("ping -q -w 1 $host > /dev/null 2>&1") ) ) {
    # do something
}
else {
    # report the failure
}
Well, mistake #1 was using -w without -c. I thought I was saying "ping the host once and wait a maximum of one second"; what it really means is "ping it as many times as you can in one second"! I wrote a benchmarking program to time various calls to ping (using Time::HiRes) and found that once I added `-c 1`, none of the other switches (-a, -r, etc.) made any real difference. With that change, system("ping -q -w 1 -c 1 $host > /dev/null 2>&1") took ~.0085 seconds instead of ~1.008 seconds!
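The timing harness was along these lines (a rough sketch, not the exact code; the host list here is just a placeholder):

use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

my @hosts = ( 'host1', 'host2' );    # placeholder list of hosts to poll

for my $host (@hosts) {
    my $t0 = [ gettimeofday() ];
    system("ping -q -w 1 -c 1 $host > /dev/null 2>&1");
    my $elapsed = tv_interval($t0);
    printf "%s: %.4f seconds (ping exit status %d)\n", $host, $elapsed, $? >> 8;
}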

I acknowledge that any sane person would have stopped here. 8.5 milliseconds (versus 1.0085 seconds) is pretty trivial for a program that runs 17 times a minute. I also must note that the procedure is neither scientific nor statistically valid. However, I'd be willing to bet I could show significance (I've messed around with more rigorous methods before, as here).

Well, being the habitual tinkerer that I am, I wondered whether there were any more savings to be had, and I focused on the overhead of redirecting the output. I tried a bunch of different forms of redirecting output to /dev/null, with no measurable improvement. So I tried

`ping -q -w 1 -c 1 $host`; #(void context)

That reduced the average time from ~.008 seconds to ~.0045 seconds! I tried using a dummy variable:

my $dummy_var=`ping -q -w 1 -c 1 $host`;

That didn't seem to have any impact on the time.

By the principles of maintainable code, if one doesn't care about the output, one shouldn't capture it--even with a comment, it's like, "why?" (besides blind adherence to a rule in a FAQ). Not to mention the (likely negligible) overhead of assigning to a variable. It seems like void backticks are the way to go here (with a comment, of course), though I admit I haven't checked whether having to test $? in a separate statement kills the savings (I'd wager it doesn't).
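In other words, something like this (just a sketch of the shape I have in mind):

`ping -q -w 1 -c 1 $host`;    # void context; we only care about the exit status
if ( $? == 0 ) {
    # do something
}
else {
    # report the failure
}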

Are there other ways that might be even more efficient? I'd be happy to benchmark them.

UPDATE: since diotalevi asked in the CB, redirecting to /dev/null in the backticks call imposes the same penalty.

UPDATE: Changed the title


I like computer programming because it's like Legos for the mind.

Replies are listed 'Best First'.
Re: I think I just found a good reason to use backticks in a void context.
by merlyn (Sage) on Jan 14, 2007 at 01:26 UTC
    The reduction in time is not because of backticks, but because you're avoiding the shell metachars, meaning that Perl can go directly to the command, rather than forking the shell first.

    If you made the test fair, by perhaps adding a semicolon to the end of your backtick command, you'd see that you're saving nothing, and perhaps even spending more time capturing things that you're just going to ignore.
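    For example (a sketch of what I mean, not something benchmarked here): the trailing semicolon below is a shell metacharacter, so Perl hands the string to /bin/sh -c, just as the redirection did in the system() version:

        # same command, but the ";" brings the shell back into the picture
        `ping -q -w 1 -c 1 $host;`;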

      OK, I don't understand what you mean by "fair." I found a way to do something that's faster than another way of doing it. I don't get how making something that works run faster is somehow a bad thing. You say

      The reduction in time is not because of backticks, but because you're avoiding the shell metachars, meaning that Perl can go directly to the command, rather than forking the shell first.
      As far as I can tell, using backticks is precisely what allows me to avoid using shell metacharacters (I am assuming you mean > and &). I guess I am having trouble understanding your position.

      Do you have a better alternative?

      I like computer programming because it's like Legos for the mind.
        Right... one that's more fair would be to fork yourself and close STDOUT and STDERR, which would probably be even faster. Something like:
        defined(my $kidpid = fork) or die "cannot fork: $!";
        unless ($kidpid) {
            close STDOUT;
            close STDERR;
            exec "your", "multiarg", "command", "here";
            die "$!";
        }
        waitpid($kidpid, 0);

        That'd be the equivalent to your backtick-that-just-happens-to-not-invoke-the-shell, but probably even more efficient.

        Using backticks is not the only way of avoiding the shell. Pass a list to system (or to a three-arg pipe-open for convenient output suppression) and you get the same effect.

        So no, this isn’t a reason to use backticks in void context; it is merely a reason to avoid invoking the shell when you don’t need it. Even if backticks were required for that, saving a couple of microseconds here and there wouldn’t constitute a good reason to do it, as per the topic.
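        For instance (a sketch; the list form leaves ping's summary on STDOUT, while the piped open reads and discards it):

            # list form of system: no shell is involved, exit status lands in $?
            system('ping', '-q', '-w', '1', '-c', '1', $host);
            my $up = ( $? == 0 );

            # or a list-form pipe open, draining the output ourselves
            open my $ping, '-|', 'ping', '-q', '-w', '1', '-c', '1', $host
                or die "cannot run ping: $!";
            1 while <$ping>;    # discard the output
            close $ping;        # child's wait status ends up in $?
            $up = ( $? == 0 );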

        Makeshifts last the longest.

      I think merlyn's point is that in the OP, you seem to be claiming that it is the use of backticks itself which makes the program faster, whereas he is pointing out that the actual cause is the lack of metacharacters. So if you were to use system instead of backticks, but still without redirection or other metacharacters, you would get the same benefit. Therefore it's not really fair to claim that the backticks made the difference.
Re: I think I just found a good reason to use backticks in a void context.
by talexb (Chancellor) on Jan 14, 2007 at 01:32 UTC

    Using something in a void context means you're throwing away any return value. This may or may not be a good idea. (Hint: it's almost *never* a good idea.)

    In this particular example, since Net::Ping is available as part of the Core distribution, I'd suggest using that module instead of anything that involves shelling out to the OS.

    From the man page:

      This module contains methods to test the reachability of remote hosts on a network. A ping object is first created with optional parameters, a variable number of hosts may be pinged multiple times and then the connection is closed.
    Sounds like it's right up your alley.
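    A minimal sketch of the sort of usage the docs describe (this uses the udp protocol with a 1-second timeout, since icmp needs root; @hosts is just a placeholder for your host list):

        use Net::Ping;

        my $p = Net::Ping->new( 'udp', 1 );    # udp protocol, 1-second timeout
        foreach my $host (@hosts) {
            if ( $p->ping($host) ) {
                # host answered
            }
            else {
                # report the failure
            }
        }
        $p->close;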

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Thanks for your response.

      When you use backticks in a void context, aren't you throwing away the output, while the return value is retained in $?, as with system()?

      Either way, Net::Ping might well be even faster. I'll look into it.


      I like computer programming because it's like Legos for the mind.
Re: I think I just found a good reason to use backticks in a void context.
by OfficeLinebacker (Chaplain) on Jan 14, 2007 at 02:34 UTC

    almut, thanks for the idea. I have to admit, I am very wary of closing STDOUT, even if the job is running in cron.

    In keeping with my offer, I tried a solution using Net::Ping, which, indeed, looks like exactly what I am looking for, as talexb mentioned.

    Here are some preliminary results. I used

    #some stuff like use strict, etc.
    Readonly my $udp_pinger => Net::Ping->new('udp',1);

    #other stuff, then beginning of loop
    my $t0 = [ gettimeofday() ];
    `ping -q -w 1 -c 1 $host`;
    die if ($?);
    my $e0 = tv_interval($t0);
    print "Ping with backticks took $e0 seconds";

    $t0 = [ gettimeofday() ];
    $udp_pinger->ping($host) || die "udp ping failed for $host";
    $e0 = tv_interval($t0);
    print "Ping with Net::Ping took $e0 seconds";

    print "..";
    #end of the loop
    I put the return value checks in the timed segments to make the comparison as "fair" as possible. Also note that the program is a loop over 18 distinct hosts, not the same host repeatedly.

    Results:

    Ping with backticks took 0.004804 seconds
    Ping with Net::Ping took 0.002682 seconds
    ..
    Ping with backticks took 0.005873 seconds
    Ping with Net::Ping took 0.001528 seconds
    ..
    Ping with backticks took 0.005679 seconds
    Ping with Net::Ping took 0.001115 seconds
    ..
    Ping with backticks took 0.005465 seconds
    Ping with Net::Ping took 0.001108 seconds
    ..
    Ping with backticks took 0.005369 seconds
    Ping with Net::Ping took 0.001098 seconds
    ..
    Ping with backticks took 0.005465 seconds
    Ping with Net::Ping took 0.001521 seconds
    ..
    Ping with backticks took 0.006113 seconds
    Ping with Net::Ping took 0.001071 seconds
    ..
    Ping with backticks took 0.005269 seconds
    Ping with Net::Ping took 0.001227 seconds
    ..
    Ping with backticks took 0.005122 seconds
    Ping with Net::Ping took 0.001262 seconds
    ..
    Ping with backticks took 0.005421 seconds
    Ping with Net::Ping took 0.00123 seconds
    ..
    Ping with backticks took 0.00513 seconds
    Ping with Net::Ping took 0.001106 seconds
    ..
    Ping with backticks took 0.00501 seconds
    Ping with Net::Ping took 0.001211 seconds
    ..
    Ping with backticks took 0.005031 seconds
    Ping with Net::Ping took 0.000958 seconds
    ..
    Ping with backticks took 0.005107 seconds
    Ping with Net::Ping took 0.001256 seconds
    ..
    Ping with backticks took 0.005139 seconds
    Ping with Net::Ping took 0.000981 seconds
    ..
    Ping with backticks took 0.005484 seconds
    Ping with Net::Ping took 0.001086 seconds
    ..
    Ping with backticks took 0.003778 seconds
    Ping with Net::Ping took 0.000643 seconds
    

    Booyah! That knocks the socks off of what I had so far. I don't know what the overhead of using the module and/or instantiating the Net::Ping object is, so the comparison still isn't "fair" until you figure that in and amortize it over the actual calls to ping(), but I am comfortable with the result being better than what I had. (And for more reasons than it just being faster.)

    Also, the data above are (IMO) a good representative sample, and one thing I have noticed is that the first call is always the slowest. I don't know what kind of optimization happens, but the second and later pings are (so far) always at least twice as fast as the first.

    Finally, pretty convenient that 'backticks' and 'Net::Ping' have the same number of characters, eh?

    I like computer programming because it's like Legos for the mind.
        Hi, Browser, I saw that; but I saw this in the doc: "NOTE: Unlike the other protocols, the return value does NOT determine if the remote host is alive or not since the full TCP three-way handshake may not have completed yet."

        So I guess the thing to do would be to ping the hosts outside the loop and then only loop through the hosts that respond? I guess you lose a little bit of timeliness (the whole script takes about 20 seconds to run), so it's possible that when I get to the last machine in my list, I'm working with a ping-success value from ~19 seconds ago. Also, a quick benchmark of the syn vs. udp protocols (similar to the one where I benchmarked backticks vs. Net::Ping) actually showed no clear advantage--some runs were faster, some were slower (and that's just the outward leg!):

        Ping with Net::Ping (syn) took 0.002087 seconds
        Ping with Net::Ping (udp) took 0.000932 seconds
        ..
        Ping with Net::Ping (syn) took 0.000977 seconds
        Ping with Net::Ping (udp) took 0.001136 seconds
        ..
        Ping with Net::Ping (syn) took 0.001145 seconds
        Ping with Net::Ping (udp) took 0.001183 seconds
        ..
        Ping with Net::Ping (syn) took 0.001282 seconds
        Ping with Net::Ping (udp) took 0.001024 seconds
        ..
        Ping with Net::Ping (syn) took 0.001163 seconds
        Ping with Net::Ping (udp) took 0.001019 seconds
        ..
        Ping with Net::Ping (syn) took 0.001049 seconds
        Ping with Net::Ping (udp) took 0.000696 seconds
        ..
        Ping with Net::Ping (syn) took 0.00084 seconds
        Ping with Net::Ping (udp) took 0.000815 seconds
        ..
        Ping with Net::Ping (syn) took 0.000822 seconds
        Ping with Net::Ping (udp) took 0.000707 seconds
        ..
        Ping with Net::Ping (syn) took 0.000956 seconds
        Ping with Net::Ping (udp) took 0.000822 seconds
        ..
        Ping with Net::Ping (syn) took 0.000828 seconds
        Ping with Net::Ping (udp) took 0.000734 seconds
        ..
        Ping with Net::Ping (syn) took 0.000802 seconds
        Ping with Net::Ping (udp) took 0.000704 seconds
        ..
        Ping with Net::Ping (syn) took 0.000835 seconds
        Ping with Net::Ping (udp) took 0.000821 seconds
        ..
        Ping with Net::Ping (syn) took 0.000822 seconds
        Ping with Net::Ping (udp) took 0.000712 seconds
        ..
        Ping with Net::Ping (syn) took 0.000702 seconds
        Ping with Net::Ping (udp) took 0.000824 seconds
        ..
        Ping with Net::Ping (syn) took 0.000823 seconds
        Ping with Net::Ping (udp) took 0.000719 seconds
        ..
        Ping with Net::Ping (syn) took 0.00104 seconds
        Ping with Net::Ping (udp) took 0.000782 seconds
        ..
        Ping with Net::Ping (syn) took 0.000563 seconds
        Ping with Net::Ping (udp) took 0.000474 seconds
        

        It seems to me a better way to test the performance is something like this (it's late, so forgive me if my logic is off):

        Readonly my $udp_pinger => Net::Ping->new( 'udp', 1 );
        Readonly my $syn_pinger => Net::Ping->new( 'syn', 1 );

        foreach my $x ( 0 .. 5 ) {

            #test the syn prot with two loops
            my $t0 = [ gettimeofday() ];
            foreach my $host (@host_list) {
                $syn_pinger->ping($host) || die "syn ping failed for $host";
            }
            foreach my $host (@host_list) {
                $syn_pinger->ack($host) || die "syn ack failed for $host";
            }
            my $e0 = tv_interval($t0);
            print "$e0 seconds to ping $NUM_HOSTS hosts with syn";

            #test the udp prot with one loop
            $t0 = [ gettimeofday() ];
            foreach my $host (@host_list) {
                $udp_pinger->ping($host) || die "udp ping failed for $host";
            }
            $e0 = tv_interval($t0);
            print "$e0 seconds to ping $NUM_HOSTS hosts with udp";

            print "---------------------------";
        } ## end foreach my $x ( 0 .. 5 )
        I didn't want to go too crazy with the number of iterations, but here's a pretty representative result:
        0.015293 seconds to ping 17 hosts with syn
        0.013442 seconds to ping 17 hosts with udp
        ---------------------------
        0.013814 seconds to ping 17 hosts with syn
        0.013393 seconds to ping 17 hosts with udp
        ---------------------------
        0.01381 seconds to ping 17 hosts with syn
        0.01327 seconds to ping 17 hosts with udp
        ---------------------------
        0.014273 seconds to ping 17 hosts with syn
        0.013056 seconds to ping 17 hosts with udp
        ---------------------------
        0.014191 seconds to ping 17 hosts with syn
        0.013396 seconds to ping 17 hosts with udp
        ---------------------------
        0.016453 seconds to ping 17 hosts with syn
        0.013 seconds to ping 17 hosts with udp
        

        I suppose a better test would be to just call ack non-specifically the right number of times and keep track of who's responded. But that seems like a lot of work to save a few milliseconds (yeah, I know--kinda late to be saying that). I'm happy going from 1.008 -> .008 -> .005 -> .0015 seconds per host. I think I've reached the point of diminishing returns. However, if you were checking hundreds or more hosts, I could see where it could be worth it.
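        If I ever do need that, my reading of the Net::Ping docs is that ack() called with no host returns the next host that has replied, so a sketch might look like:

            # fire off all the syn pings first ...
            $syn_pinger->ping($_) for @host_list;

            # ... then collect acks in whatever order the replies come back
            my %alive;
            for ( 1 .. @host_list ) {
                my ( $host, $rtt, $ip ) = $syn_pinger->ack();
                last unless defined $host;    # no more replies within the timeout
                $alive{$host} = $rtt;
            }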


        I like computer programming because it's like Legos for the mind.
Re: I think I just found a good reason to use backticks in a void context.
by almut (Canon) on Jan 14, 2007 at 02:02 UTC

    How about temporarily closing STDOUT? That way you can avoid having to redirect to /dev/null, and more importantly, you no longer need shell metacharacters (which in turn avoids spawning a shell for system()).

    Something like this:

    open ORIGOUT, ">&STDOUT";
    close STDOUT;

    my $ok = ! system qw"ping -q -w 1 -c 1", $host;

    open STDOUT, ">&ORIGOUT";
    print $ok ? "OK\n" : "Failed\n";
Re: I thought I found a good reason to use backticks in a void context, but I was wrong.
by talexb (Chancellor) on Jan 14, 2007 at 18:18 UTC

    If you are regularly pinging hosts, is this actually something that could better be handled by something like Nagios? This assumes that you're interested in a) whether the hosts are up or not, and b) what the ping performance is.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Hey, talexb, you are prescient. The administrators of our network do run Nagios, but they don't expose much of the interface to the users, and I have no admin privileges. My colleagues said that a program that helped them pick which of our common-use machines to use at any given time would be very helpful, and I had no idea whether the admin people had anything similar, so I built a system. At first, it was a shell command that automatically logged the user on to the machine with the lowest load and opened an instance of emacs. Since the information that program used was stored in a file (collected every five minutes by polling the machines in question), I created a web page that displayed the stats.

      Several users got pretty psyched about the web page. I guess they run a lot of heavy duty Monte Carlo simulations.

      Since I got positive feedback I've been working on an upgrade that uses a DB back end and stores minutely status data for the 18 machines. Eventually I hope to be able to create historical graphs of the data. The admin folks do have something similar, but don't advertise it.

      The reason I say you're prescient is because I was poking around just today and found a shell command someone wrote that basically wgets a nagios web page displaying a list of hosts that looks something like

      host1 is UP
      host2 is DOWN
      etc.
      
      I have no idea how timely that data is or how it's created.

      I don't generally track how quickly the hosts respond to the pings; I just needed to make sure they were up before I tried to ssh to them and execute the script that collected and reported the statistics. I was going to say I don't care, but now that you mention it, it does seem (and this is pure speculation based on anecdotal evidence) that the machines with a higher load tend to take longer to reply and to return the stats. That info might be worth recording. Sort of like finding the best Counterstrike server!


      I like computer programming because it's like Legos for the mind.
        Eventually I hope to be able to create historical graphs of the data.
        Smokeping may give you some ideas. It uses RRDtool, a round-robin data storage/retrieval/graphing facility, which is particularly well suited to data whose resolution becomes less important as it ages. Cheers.