
Perl never does its own context switching.

Aah, of course! Thanks (I used to know that, I can now remember discussing it with some Java guys a while ago).

except you would have to change Win32::Sleep 0; for yield
Thanks. I don't normally use threads, and I had tried sleep and usleep to get the threads to switch. It works with yield. I ran your code on two different systems, a dual-Xeon 2.8GHz and an AMD64 2GHz; here's what I got (average figures, with spikes of up to +/- 20%):

                                Xeon         AMD64
single process, empty loop      420,000/s    360,000/s
two processes, empty loop       330,000/s    220,000/s
single process, gettimeofday    230,000/s    110,000/s
two processes, gettimeofday     180,000/s     87,000/s

But if I change that code to use fork and processes, and add an explicit call to the sched_yield system call via Inline::C:

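#!perl -slw
use strict;
use POSIX qw( WNOHANG );
use Time::HiRes qw(gettimeofday);
use Inline C => <<'END_OF_CODE';
#include <sched.h>
void yield_me() { sched_yield(); }
END_OF_CODE

# Child body: spin forever, yielding the CPU on every iteration.
sub thread {
    while (1) {
        # gettimeofday();       # uncomment for the gettimeofday rows
        yield_me();
    }
}

# Fork 10 children. fork returns undef on failure and 0 in the child.
my @forks = map {
    my $pid = fork;
    die "Cannot fork: $!" unless defined $pid;
    thread() if $pid == 0;      # child: never returns
    waitpid( -1, WNOHANG );     # parent: non-blocking reap
    $pid;
} 1 .. 10;

<STDIN>;                        # run until Enter is pressed
kill 'TERM', @forks;            # then stop the children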

I get these results:

                                Xeon         AMD64
single process, empty loop      865,000/s    550,000/s
two processes, empty loop       845,000/s    525,000/s
single process, gettimeofday    345,000/s    130,000/s
two processes, gettimeofday     340,000/s    125,000/s

which is quite a bit better.

I changed my code to use an explicit yield with threads and sched_yield with forks as well, and got down to a 4-microsecond minimum delay for both on the single-processor system with the older kernel (AMD Athlon 2.2GHz). So maybe I should retract my earlier retraction ;-). It would appear that on a system faster than mine (or with a kernel better tuned for this, e.g. an RT kernel) it could be possible to get duplicate timestamps from forked processes.
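For anyone who wants to try the minimum-delay measurement themselves, here is a rough sketch of the idea. It is not the code I actually ran: it samples gettimeofday back to back in a single process (no threads, no forks, no yield), and min_gap_us is just a name I made up for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper (not from the original test code): call
# gettimeofday() $n times back to back and return the smallest gap
# between consecutive readings, in microseconds, together with the
# number of readings identical to their predecessor.
sub min_gap_us {
    my ($n) = @_;
    my ( $min, $dups ) = ( undef, 0 );
    my ( $prev_s, $prev_us ) = gettimeofday();    # (seconds, microseconds)
    for ( 2 .. $n ) {
        my ( $s, $us ) = gettimeofday();
        my $gap = ( $s - $prev_s ) * 1_000_000 + ( $us - $prev_us );
        $dups++ if $gap == 0;                     # same timestamp twice
        $min = $gap if !defined $min || $gap < $min;
        ( $prev_s, $prev_us ) = ( $s, $us );
    }
    return ( $min, $dups );
}

my ( $min, $dups ) = min_gap_us(100_000);
printf "minimum gap: %d us, duplicate readings: %d\n", $min, $dups;
```

A non-zero duplicate count here only says that one process can read the same microsecond twice on that box; the threaded and forked cases would need the readings collected from every worker and compared afterwards.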

The really interesting thing about these shenanigans is that context switching between processes seems to be a lot faster than between Perl threads under Linux with a 2.6 kernel. (I ran some of this on a 2.4 system as well, and the results were not as good.) I'll need to do some more benchmarks, I think.


Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan

Re^9: OT How fast a cpu to overwhelm Time::HiRes
by BrowserUk (Patriarch) on Dec 01, 2005 at 15:00 UTC

    If I drop the number of threads to 10 as shown in your snippet, the context switch rate goes up to the 860k mark. Drop it to 2, and it averages out at over 1,100,000/s. Both for the empty loop.

    Adding back the gettimeofday(), the best I can achieve is 170,000, even with just 2 threads and a single process.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Ah, right, that would be it. Here are the numbers for 100 concurrent processes with fork (the thread measurement above was done with 100 threads).

                                      Xeon         AMD64
      single process, empty loop      500,000/s    220,000/s
      two processes, empty loop       340,000/s    185,000/s
      single process, gettimeofday    235,000/s     85,000/s
      two processes, gettimeofday     190,000/s     80,000/s

      Still better than threads for the SMP system, not as good for the single-processor one. Which is as expected, so nothing to see here ;-).

