Re^8: OT How fast a cpu to overwhelm Time::HiRes

Perl never does its own context switching.

Aah, of course! Thanks. (I used to know that; I can now remember discussing it with some Java guys a while ago.)

except you would have to change Win32::Sleep 0; for yield

Thanks. I don't normally use threads; I had tried using sleep and usleep to get the threads to switch. It works with yield. I ran your code on two different systems, a dual-Xeon 2.8GHz and an AMD64 2GHz, and here's what I got (average figures; spikes were up to +/- 20%):

                               Xeon         AMD64
single process, empty loop     420,000/s    360,000/s
two processes, empty loop      330,000/s    220,000/s
single process, gettimeofday   230,000/s    110,000/s
two processes, gettimeofday    180,000/s    87,000/s

But if I change that code to use fork and processes, and add an explicit call to the sched_yield system call via Inline::C:

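#!perl -slw
use strict;
use POSIX qw( WNOHANG );
use Time::HiRes qw(gettimeofday);
use Inline C => <<'END_OF_CODE';
#include <sched.h>

/* Give up the CPU so another runnable process can be scheduled. */
void yield_me() {
    sched_yield();
}
END_OF_CODE

# Body of each child: yield in a tight loop. Uncomment gettimeofday()
# for the "gettimeofday" rows of the table.
sub thread {
    while (1) {
#        gettimeofday();
        yield_me();
    }
}

# Fork 10 children. fork returns undef on failure and 0 in the child.
my @forks = map {
    my $pid = fork;
    die "Cannot fork: $!" unless defined $pid;
    thread() if $pid == 0;    # child: loops forever, never returns
    waitpid(-1, WNOHANG);     # parent: reap any child that already exited
    $pid;
} 1 .. 10;

<STDIN>;                      # run until Enter is pressed
kill 'TERM', @forks;          # then stop the children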

I get these results:

                               Xeon         AMD64
single process, empty loop     865,000/s    550,000/s
two processes, empty loop      845,000/s    525,000/s
single process, gettimeofday   345,000/s    130,000/s
two processes, gettimeofday    340,000/s    125,000/s

which is quite a bit better.

I changed my code to use explicit yield with threads and sched_yield with forks as well, and got down to a 4 microsecond minimum delay for both on the single-processor system with the older kernel (AMD Athlon 2.2GHz). So maybe I should retract my earlier retraction ;-). It would appear that with a system that's faster than mine (or a kernel that's better tuned to this, e.g. an RT kernel) it could be possible to get duplicate timestamps from calls made in forked processes.
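
A minimal sketch of how such a minimum delay could be measured within a single process follows. It is not the code that produced the figures above; the helper name min_gap_us and the sample count of 100,000 are assumptions for illustration. It only compares consecutive calls in one process, so it gives a rough indication rather than a test of the cross-process case discussed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper (not from the thread): call gettimeofday() $n times
# back to back and return the smallest gap between consecutive calls in
# microseconds, plus how many consecutive calls returned the same value.
sub min_gap_us {
    my ($n) = @_;
    my ($min, $dupes) = (undef, 0);
    my ($ps, $pus) = gettimeofday();    # list context: (seconds, microseconds)
    for (2 .. $n) {
        my ($s, $us) = gettimeofday();
        my $gap = ($s - $ps) * 1_000_000 + ($us - $pus);
        $dupes++ if $gap == 0;
        $min = $gap if !defined $min || $gap < $min;
        ($ps, $pus) = ($s, $us);
    }
    return ($min, $dupes);
}

my ($min, $dupes) = min_gap_us(100_000);
printf "minimum gap: %d us, identical consecutive timestamps: %d\n",
    $min, $dupes;
```

A non-zero count of identical timestamps would mean that two successive calls already land within the same microsecond on that machine, without any help from a second process.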

The really interesting thing about these shenanigans is that context switching between processes seems to be a lot faster than between Perl threads under Linux with a 2.6 kernel. (I ran some of this on a 2.4 system as well, and the results were not as good.) I'll need to do some more benchmarks, I think.

Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan

Re^9: OT How fast a cpu to overwhelm Time::HiRes by BrowserUk (Patriarch) on Dec 01, 2005 at 15:00 UTC
If I drop the number of threads to 10 as shown in your snippet, the context switch rate goes up to the 860k mark. Drop it to 2, and it averages out at over 1,100,000/s. Both figures are for the empty loop. With gettimeofday() added back, the best I can achieve is 170,000, even with just 2 threads and a single process.

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal? "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice.

Re^10: OT How fast a cpu to overwhelm Time::HiRes by tirwhan (Abbot) on Dec 01, 2005 at 15:45 UTC
Ah, right, that would be it. Here are the numbers for 100 concurrent processes with fork (the thread measurement above was done with 100 threads).

                               Xeon         AMD64
single process, empty loop     500,000/s    220,000/s
two processes, empty loop      340,000/s    185,000/s
single process, gettimeofday   235,000/s    85,000/s
two processes, gettimeofday    190,000/s    80,000/s

Still better than threads for the SMP system, not as good for the single-processor one. Which is as expected, so nothing to see here ;-).