Re: I think I just found a good reason to use backticks in a void context.
by merlyn (Sage) on Jan 14, 2007 at 01:26 UTC
|
The reduction in time is not because of backticks, but because you're avoiding the shell metachars, meaning that Perl can go directly to the command, rather than forking the shell first.
If you made the test fair, by perhaps adding a semicolon to the end of your backtick command, you'd see that you're saving nothing, and perhaps even spending more time capturing things that you're just going to ignore.
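One way to check this is to time the same command with and without a trailing semicolon. The following is only a sketch: it assumes a Unix-like system with a `true` command on the PATH, and `time_backticks` and the run count of 50 are illustrative names and values, not anything from the original post.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Average wall-clock seconds per call for a command run via backticks.
sub time_backticks {
    my ($cmd, $runs) = @_;
    my $t0 = [gettimeofday];
    for (1 .. $runs) {
        my $ignored = `$cmd`;    # output is captured, then thrown away
    }
    return tv_interval($t0) / $runs;
}

# No metacharacters: Perl can exec the command directly.
printf "without semicolon: %.6f s per call\n", time_backticks('true', 50);
# The trailing semicolon is a shell metacharacter, so a shell is started first.
printf "with semicolon:    %.6f s per call\n", time_backticks('true;', 50);
```

The absolute numbers depend on the machine; only the difference between the two lines is of interest.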
| [reply] |
|
|
Right... one that's more fair would be to fork yourself and close STDOUT and STDERR, which would probably be even faster. Something like:
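    # Fork, silence the child's output, then exec the command without a shell.
    defined(my $kidpid = fork) or die "cannot fork: $!";
    unless ($kidpid) {
        # Child: closing both handles discards everything the command prints.
        close STDOUT;
        close STDERR;
        exec "your", "multiarg", "command", "here";
        die "cannot exec: $!";    # reached only if exec fails
    }
    waitpid($kidpid, 0);          # the child's exit status is now in $?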
That'd be equivalent to your backtick-that-just-happens-to-not-invoke-the-shell, but probably even more efficient.
| [reply] [d/l] |
|
|
Using backticks is not the only way of avoiding the shell. Pass a list to system (or to a three-arg pipe-open for convenient output suppression) and you get the same effect.
So no, this isn’t a reason to use backticks in void context; it is merely a reason to avoid invoking the shell when you don’t need it. Even if backticks were required for that, saving a couple of microseconds here and there wouldn’t constitute a good reason to do it, as per the topic.
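A minimal sketch of the list-form pipe-open mentioned above, assuming a Unix-like system with `echo` available. The name `run_quietly` is made up for illustration. Note that only standard output is swallowed here; standard error still goes wherever it went before.

```perl
use strict;
use warnings;

# Run a command without a shell, discard its standard output, report success.
sub run_quietly {
    my @cmd = @_;
    # With a command name plus at least one argument this is the list form
    # of pipe open, so Perl execs the command directly instead of via a shell.
    open(my $out, '-|', @cmd) or die "cannot run $cmd[0]: $!";
    1 while <$out>;    # drain the pipe and discard what the command printed
    close $out;        # waits for the child and sets $?
    return $? == 0;
}

print run_quietly('echo', 'hello') ? "OK\n" : "Failed\n";    # prints OK
```

If the output does not need suppressing at all, `system('echo', 'hello')` with a list gives the same shell-free behaviour.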
Makeshifts last the longest.
| [reply] |
|
|
I think merlyn's point is that in the OP, you seem to be claiming that it is using backticks itself which makes the program faster, whereas he is pointing out that the actual cause is the lack of metacharacters. So if you were to use system instead of backticks, but still without redirection or other metacharacters, you would get the same benefit. Therefore it's not really fair to claim that the backticks made the difference.
| [reply] |
|
|
Re: I think I just found a good reason to use backticks in a void context.
by talexb (Chancellor) on Jan 14, 2007 at 01:32 UTC
|
Using something in a void context means you're throwing away any return value. This may or may not be a good idea. (Hint: it's almost *never* a good idea.)
In this particular example, since Net::Ping is available as part of the Core distribution, I'd suggest using that module instead of anything that involves shelling out to the OS.
From the man page:
This module contains methods to test the reachability of remote hosts
on a network. A ping object is first created with optional parameters,
a variable number of hosts may be pinged multiple times and then the
connection is closed.
Sounds like it's right up your alley.
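As a rough sketch of what that could look like (the helper name `reachable_hosts` is hypothetical, and the 'tcp' protocol with a one-second timeout is just one choice among those Net::Ping offers):

```perl
use strict;
use warnings;
use Net::Ping;

# Return the subset of @hosts for which the pinger's ping() returns true.
sub reachable_hosts {
    my ($pinger, @hosts) = @_;
    return grep { $pinger->ping($_) } @hosts;
}

# Only touch the network when hosts are given on the command line.
if (@ARGV) {
    my $pinger = Net::Ping->new('tcp', 1);    # 'tcp' needs no special privileges
    print "$_ is up\n" for reachable_hosts($pinger, @ARGV);
    $pinger->close;
}
```

Run it as, say, `perl check.pl host1 host2`; no external ping process is started at all.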
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
| [reply] [d/l] |
|
|
Thanks for your response.
When you use backticks in a void context, aren't you throwing away the output, while the return value is retained in $?, as with system()?
Either way, Net::Ping might well be even faster. I'll look into it.
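On the question about `$?`: the exit status is indeed set after backticks, just as after system(), even when the output is discarded. A small sketch to confirm it, assuming a POSIX `sh` is available (the function name is illustrative):

```perl
use strict;
use warnings;

# Run a command with backticks in void context and return its exit code.
sub exit_code_after_backticks {
    my ($cmd) = @_;
    `$cmd`;            # void context: the output is read and discarded
    return $? >> 8;    # the high byte of $? holds the exit code
}

print exit_code_after_backticks(q{sh -c 'exit 3'}), "\n";    # prints 3
```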
I like computer programming because it's like Legos for the mind.
| [reply] |
Re: I think I just found a good reason to use backticks in a void context.
by OfficeLinebacker (Chaplain) on Jan 14, 2007 at 02:34 UTC
|
almut, thanks for the idea. I have to admit, I am very wary of closing STDOUT, even if the job is running in cron.
In keeping with my offer, I tried a solution using Net::Ping, which, indeed, looks like exactly what I am looking for, as talexb mentioned.
Here are some preliminary results. I used
    # some stuff like use strict, etc.
    Readonly my $udp_pinger => Net::Ping->new('udp', 1);
    # other stuff, then beginning of loop
    my $t0 = [ gettimeofday() ];
    `ping -q -w 1 -c 1 $host`;
    die if ($?);
    my $e0 = tv_interval($t0);
    print "Ping with backticks took $e0 seconds";
    $t0 = [ gettimeofday() ];
    $udp_pinger->ping($host) || die "udp ping failed for $host";
    $e0 = tv_interval($t0);
    print "Ping with Net::Ping took $e0 seconds";
    print "..";
    # end of the loop
I put the return value checks in the timed segments to make the comparison as "fair" as possible. Also note that the program is a loop over 18 distinct hosts, not the same host repeatedly.
Results:
Ping with backticks took 0.004804 seconds
Ping with Net::Ping took 0.002682 seconds
..
Ping with backticks took 0.005873 seconds
Ping with Net::Ping took 0.001528 seconds
..
Ping with backticks took 0.005679 seconds
Ping with Net::Ping took 0.001115 seconds
..
Ping with backticks took 0.005465 seconds
Ping with Net::Ping took 0.001108 seconds
..
Ping with backticks took 0.005369 seconds
Ping with Net::Ping took 0.001098 seconds
..
Ping with backticks took 0.005465 seconds
Ping with Net::Ping took 0.001521 seconds
..
Ping with backticks took 0.006113 seconds
Ping with Net::Ping took 0.001071 seconds
..
Ping with backticks took 0.005269 seconds
Ping with Net::Ping took 0.001227 seconds
..
Ping with backticks took 0.005122 seconds
Ping with Net::Ping took 0.001262 seconds
..
Ping with backticks took 0.005421 seconds
Ping with Net::Ping took 0.00123 seconds
..
Ping with backticks took 0.00513 seconds
Ping with Net::Ping took 0.001106 seconds
..
Ping with backticks took 0.00501 seconds
Ping with Net::Ping took 0.001211 seconds
..
Ping with backticks took 0.005031 seconds
Ping with Net::Ping took 0.000958 seconds
..
Ping with backticks took 0.005107 seconds
Ping with Net::Ping took 0.001256 seconds
..
Ping with backticks took 0.005139 seconds
Ping with Net::Ping took 0.000981 seconds
..
Ping with backticks took 0.005484 seconds
Ping with Net::Ping took 0.001086 seconds
..
Ping with backticks took 0.003778 seconds
Ping with Net::Ping took 0.000643 seconds
Booyah! Knocks the socks off of what I had so far. I don't know what the overhead of using the module and/or instantiating the Net::Ping object is, so the comparison still isn't "fair" until you figure that out and amortize it over the actual calls to ping(), but I am comfortable with the result being better than what I had. (And for more reasons than it just being faster.)
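A rough sketch of how that amortization could be estimated. The function names are made up, and the 0.0012 s per-ping figure is only a value roughly typical of the results above, not a measurement taken by this code. Module load time is not covered, since `use` runs at compile time.

```perl
use strict;
use warnings;
use Net::Ping;
use Time::HiRes qw(gettimeofday tv_interval);

# Seconds spent constructing one UDP pinger.
sub constructor_seconds {
    my $t0      = [gettimeofday];
    my $pinger  = Net::Ping->new('udp', 1);
    my $elapsed = tv_interval($t0);
    $pinger->close;
    return $elapsed;
}

# Spread a one-time setup cost evenly over a number of pings.
sub amortized_seconds {
    my ($setup, $per_ping, $pings) = @_;
    return $per_ping + $setup / $pings;
}

my $setup = constructor_seconds();
printf "setup: %.6f s, amortized per ping over 18 hosts: %.6f s\n",
    $setup, amortized_seconds($setup, 0.0012, 18);
```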
Also, the data above are (IMO) a good representative sample, and one thing I have noticed is that the first call is always the slowest. I don't know what kind of optimization happens, but the second and later pings are (so far) always at least twice as fast as the first.
Finally, pretty convenient that 'backticks' and 'Net::Ping' have the same number of characters, eh?
I like computer programming because it's like Legos for the mind.
| [reply] [d/l] |
|
|
If you want this to run even faster, consider using Net::Ping with the 'syn' protocol. This allows you to overlap the pings to multiple hosts by not waiting for each to respond before starting the next.
See fping for an example of this.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] |
|
|
Hi, Browser, I saw that; but I saw this in the doc:
"NOTE: Unlike the other protocols, the return value does NOT determine if the remote host is alive or not since the full TCP three-way handshake may not have completed yet."
So I guess the thing to do would be to ping the hosts outside the loop and then only loop through the hosts that respond? I guess you lose a little bit of timeliness (the whole script takes about 20 seconds to run). So it's possible that when I get to the last machine in my list, I'm working on a ping success value from ~19 seconds ago. Also, a quick benchmark of the syn vs udp protocols (similar to the one where I bm'd backticks vs Net::Ping) actually showed no advantage (some were faster, some were slower) (and that's just the outward leg!):
Ping with Net::Ping (syn) took 0.002087 seconds
Ping with Net::Ping (udp) took 0.000932 seconds
..
Ping with Net::Ping (syn) took 0.000977 seconds
Ping with Net::Ping (udp) took 0.001136 seconds
..
Ping with Net::Ping (syn) took 0.001145 seconds
Ping with Net::Ping (udp) took 0.001183 seconds
..
Ping with Net::Ping (syn) took 0.001282 seconds
Ping with Net::Ping (udp) took 0.001024 seconds
..
Ping with Net::Ping (syn) took 0.001163 seconds
Ping with Net::Ping (udp) took 0.001019 seconds
..
Ping with Net::Ping (syn) took 0.001049 seconds
Ping with Net::Ping (udp) took 0.000696 seconds
..
Ping with Net::Ping (syn) took 0.00084 seconds
Ping with Net::Ping (udp) took 0.000815 seconds
..
Ping with Net::Ping (syn) took 0.000822 seconds
Ping with Net::Ping (udp) took 0.000707 seconds
..
Ping with Net::Ping (syn) took 0.000956 seconds
Ping with Net::Ping (udp) took 0.000822 seconds
..
Ping with Net::Ping (syn) took 0.000828 seconds
Ping with Net::Ping (udp) took 0.000734 seconds
..
Ping with Net::Ping (syn) took 0.000802 seconds
Ping with Net::Ping (udp) took 0.000704 seconds
..
Ping with Net::Ping (syn) took 0.000835 seconds
Ping with Net::Ping (udp) took 0.000821 seconds
..
Ping with Net::Ping (syn) took 0.000822 seconds
Ping with Net::Ping (udp) took 0.000712 seconds
..
Ping with Net::Ping (syn) took 0.000702 seconds
Ping with Net::Ping (udp) took 0.000824 seconds
..
Ping with Net::Ping (syn) took 0.000823 seconds
Ping with Net::Ping (udp) took 0.000719 seconds
..
Ping with Net::Ping (syn) took 0.00104 seconds
Ping with Net::Ping (udp) took 0.000782 seconds
..
Ping with Net::Ping (syn) took 0.000563 seconds
Ping with Net::Ping (udp) took 0.000474 seconds
It seems to me a better way to test the performance is something like this (it's late, so forgive me if my logic is off):
    Readonly my $udp_pinger => Net::Ping->new( 'udp', 1 );
    Readonly my $syn_pinger => Net::Ping->new( 'syn', 1 );
    foreach my $x ( 0 .. 5 ) {
        # test the syn prot with two loops
        my $t0 = [ gettimeofday() ];
        foreach my $host (@host_list) {
            $syn_pinger->ping($host) || die "syn ping failed for $host";
        }
        foreach my $host (@host_list) {
            $syn_pinger->ack($host) || die "syn ack failed for $host";
        }
        my $e0 = tv_interval($t0);
        print "$e0 seconds to ping $NUM_HOSTS hosts with syn";
        # test the udp prot with one loop
        $t0 = [ gettimeofday() ];
        foreach my $host (@host_list) {
            $udp_pinger->ping($host) || die "udp ping failed for $host";
        }
        $e0 = tv_interval($t0);
        print "$e0 seconds to ping $NUM_HOSTS hosts with udp";
        print "---------------------------";
    } ## end foreach my $x ( 0 .. 5 )
I didn't want to go too crazy with the number of iterations, but here's a pretty representative result:
0.015293 seconds to ping 17 hosts with syn
0.013442 seconds to ping 17 hosts with udp
---------------------------
0.013814 seconds to ping 17 hosts with syn
0.013393 seconds to ping 17 hosts with udp
---------------------------
0.01381 seconds to ping 17 hosts with syn
0.01327 seconds to ping 17 hosts with udp
---------------------------
0.014273 seconds to ping 17 hosts with syn
0.013056 seconds to ping 17 hosts with udp
---------------------------
0.014191 seconds to ping 17 hosts with syn
0.013396 seconds to ping 17 hosts with udp
---------------------------
0.016453 seconds to ping 17 hosts with syn
0.013 seconds to ping 17 hosts with udp
I suppose a better test would be to just call ack non-specifically the right number of times, and keep track of who's responded. But that seems like a lot of work to save a few milliseconds (yeah, I know--kinda late to be saying that). I'm happy going from 1.008 -> 0.008 -> 0.005 -> 0.0015 seconds per host. I think I've reached the point of diminishing returns. However, if you were checking hundreds or more hosts, I could see where that could be worth it.
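For reference, the "call ack non-specifically" idea might look something like the sketch below. It relies on ack() with no argument returning the next responder as a (host, round-trip time, IP) list and an empty list once nobody is left, as described in the Net::Ping documentation; the function name `syn_sweep` is hypothetical.

```perl
use strict;
use warnings;
use Net::Ping;

# Send a SYN to every host first, then collect replies in whatever order
# they arrive. Returns a hash of host => round-trip time for responders.
sub syn_sweep {
    my ($pinger, @hosts) = @_;
    $pinger->ping($_) for @hosts;    # each call returns without waiting for a reply
    my %rtt;
    while (my ($host, $rtt) = $pinger->ack) {    # no argument: next responder
        $rtt{$host} = $rtt;
    }
    return %rtt;
}

# Only touch the network when hosts are given on the command line.
if (@ARGV) {
    my $pinger = Net::Ping->new('syn', 1);
    my %rtt    = syn_sweep($pinger, @ARGV);
    for my $host (@ARGV) {
        print "$host: ", (exists $rtt{$host} ? "up ($rtt{$host} s)" : "no reply"), "\n";
    }
    $pinger->close;
}
```

Hosts missing from the returned hash are the ones that never answered within the timeout, so there is no need to die on the first failure.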
I like computer programming because it's like Legos for the mind.
| [reply] [d/l] |
|
|
Re: I think I just found a good reason to use backticks in a void context.
by almut (Canon) on Jan 14, 2007 at 02:02 UTC
|
How about temporarily closing STDOUT? This way you can avoid having
to redirect to /dev/null, and more importantly, you now no longer need
shell metacharacters (which in turn avoids spawning a shell for system()).
Something like this:
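    # Save STDOUT, close it while ping runs, then restore it.
    open ORIGOUT, ">&STDOUT" or die "cannot dup STDOUT: $!";
    close STDOUT;
    # List-form system() bypasses the shell; exit status 0 means success.
    my $ok = ! system qw"ping -q -w 1 -c 1", $host;
    open STDOUT, ">&ORIGOUT" or die "cannot restore STDOUT: $!";
    print $ok ? "OK\n" : "Failed\n";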
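A variation on the same idea, for anyone uneasy about leaving STDOUT closed: point it at the null device for the duration of the call instead. This is a different technique from the one above, sketched under the assumption of a Unix-like system; `quiet_system` is an illustrative name.

```perl
use strict;
use warnings;
use File::Spec;

# Run a command without a shell while STDOUT points at the null device.
sub quiet_system {
    my @cmd = @_;
    open(my $saved, '>&', \*STDOUT) or die "cannot dup STDOUT: $!";
    open(STDOUT, '>', File::Spec->devnull) or die "cannot redirect STDOUT: $!";
    my $status = system { $cmd[0] } @cmd;    # indirect-object form: never a shell
    open(STDOUT, '>&', $saved) or die "cannot restore STDOUT: $!";
    return $status == 0;
}

print quiet_system('echo', 'hidden') ? "OK\n" : "Failed\n";    # prints OK
```

For the ping case this would be called as `quiet_system(qw(ping -q -w 1 -c 1), $host)`.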
| [reply] [d/l] [select] |
Re: I thought I found a good reason to use backticks in a void context, but I was wrong.
by talexb (Chancellor) on Jan 14, 2007 at 18:18 UTC
|
If you are regularly pinging hosts, is this actually something that could better be handled by something like Nagios? This assumes that you're interested in a) whether the hosts are up or not, and b) what the ping performance is.
Alex / talexb / Toronto
"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds
| [reply] |
|
|
Hey, talexb, you are prescient. The administrators of our network do run Nagios, but they don't expose much of the interface to the users. I have no admin privileges. My colleagues said that a program that helped them pick which of our common-use machines to use at any given time would be very helpful. I had no idea whether the admin people had anything similar, or what it might be. So I built a system. At first, it was a shell command that automatically logged the user on to the machine with the lowest load and opened an instance of emacs. Since the information that program used was stored in a file (collected every five minutes by polling the machines in question), I created a web page that displayed the stats.
Several users got pretty psyched about the web page. I guess they run a lot of heavy duty Monte Carlo simulations.
Since I got positive feedback I've been working on an upgrade that uses a DB back end and stores per-minute status data for the 18 machines. Eventually I hope to be able to create historical graphs of the data. The admin folks do have something similar, but don't advertise it.
The reason I say you're prescient is because I was poking around just today and found a shell command someone wrote that basically wgets a Nagios web page displaying a list of hosts that looks something like
host1 is UP
host2 is DOWN
etc.
I have no idea how timely that data is or how it's created.
I don't generally track how quickly the hosts respond to the pings, I just needed to make sure they were up before I tried to ssh to them and execute the script that collected and reported the statistics. I was going to say I don't care, but now that you mention it, it does seem (and this is pure speculation based on anecdotal evidence) that the machines that have a higher load tend to take longer to reply and to return the stats. That info might be worth recording. Sort of like finding the best Counterstrike server!
I like computer programming because it's like Legos for the mind.
| [reply] |
|
|
Eventually I hope to be able to create historical graphs of the data.
Smokeping may give you some ideas. It uses RRDtool, a round-robin data storage/retrieval/graphing facility, which is particularly well suited for the type of data the resolution of which becomes less important as it ages. Cheers.
| [reply] |