in reply to OT How fast a cpu to overwhelm Time::HiRes

I seriously doubt that it is possible to get two identical timestamps from Time::HiRes on a single-CPU system. At least not under Win32, and I doubt it is much different under other OSs.

Under Win32, the mechanism underlying the timestamps is the QueryPerformanceCounter() API. It has a companion call, QueryPerformanceFrequency(), which tells you how quickly the raw timer changes. On my 2.66 GHz system, this frequency is reported as 3579545 ticks/second. The second call is required because the rate of the counter is not fixed by the API: it depends on the hardware behind the timing mechanism. Where the counter is derived from the processor clock, a faster processor means a faster counter.

Calling the API directly from within Perl, I get:

P:\test>perl qpc.pl
t/S: 3579545
Ticks1: 4139490207899
Ticks2: 4139490207925
Diff : 26

P:\test>perl qpc.pl
t/S: 3579545
Ticks1: 4139494180449
Ticks2: 4139494180475
Diff : 26

P:\test>perl qpc.pl
t/S: 3579545
Ticks1: 4139496939489
Ticks2: 4139496939517
Diff : 28

P:\test>perl qpc.pl
t/S: 3579545
Ticks1: 4139500850229
Ticks2: 4139500850256
Diff : 27

This shows that, even bypassing the code in Time::HiRes, two consecutive calls take at least 25 ticks. On my system that translates into approximately 25 / 3579545 = 0.00000698 seconds or, at 2.66 GHz, roughly 18500 CPU cycles. On modern processors, that can often equate to about 18500 CPU instructions.

If I drop into C, I get:

P:\test>qpc
Frequency: 3579545
T1: 823060571
T2: 823060575

P:\test>qpc
Frequency: 3579545
T1: 829532517
T2: 829532521

P:\test>qpc
Frequency: 3579545
T1: 833572712
T2: 833572716

The closest I get is 4 ticks, or roughly 0.0000011 seconds or 3000 cpu cycles/instructions.
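
As a rough check of the arithmetic above, here is a minimal Perl sketch. The helper names ticks_to_seconds and seconds_to_cycles are made up for illustration (they are not part of Time::HiRes or the Win32 API), and the inputs are simply the figures quoted for my machine.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this example only.
sub ticks_to_seconds {
    my ($ticks, $ticks_per_second) = @_;
    return $ticks / $ticks_per_second;
}

sub seconds_to_cycles {
    my ($seconds, $cpu_hz) = @_;
    return $seconds * $cpu_hz;
}

my $freq   = 3_579_545;   # ticks/second, as reported by QueryPerformanceFrequency()
my $cpu_hz = 2.66e9;      # 2.66 GHz processor

# 25 ticks: the Perl case; 4 ticks: the C case.
for my $ticks (25, 4) {
    my $secs = ticks_to_seconds($ticks, $freq);
    printf "%2d ticks = %.8f s = about %.0f cycles\n",
        $ticks, $secs, seconds_to_cycles($secs, $cpu_hz);
}
```

The cycle counts assume the processor runs at its nominal clock for the whole interval, so treat them as estimates.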

Whilst there are many faster processors than mine, the salient point is that, where the counter is derived from the processor clock, the frequency of the counter increases as the processor gets faster.

So the chance of you getting two identical timestamps from within Perl on a single-processor system, given the extra cycles that getting from perl to the hardware and back again involves, seems pretty small.
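
If you want to check this on your own machine, a minimal single-process sketch might look like the following. It uses gettimeofday from Time::HiRes rather than calling the Win32 API directly; count_identical_timestamps is a name made up for this example, and the sample count is arbitrary. Since gettimeofday reports whole microseconds, the outcome depends entirely on how fast your hardware and OS get through one call.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper: take $samples back-to-back readings and return
# the number of consecutive pairs that were identical, plus the smallest
# non-zero gap seen (in microseconds, undef if none was seen).
sub count_identical_timestamps {
    my ($samples) = @_;
    my ($duplicates, $min_gap) = (0, undef);
    my ($prev_s, $prev_us) = gettimeofday();
    for (2 .. $samples) {
        my ($s, $us) = gettimeofday();
        my $gap = ($s - $prev_s) * 1_000_000 + ($us - $prev_us);
        if ($gap == 0) {
            $duplicates++;
        }
        elsif ($gap > 0 && (!defined $min_gap || $gap < $min_gap)) {
            $min_gap = $gap;
        }
        ($prev_s, $prev_us) = ($s, $us);
    }
    return ($duplicates, $min_gap);
}

my ($duplicates, $min_gap) = count_identical_timestamps(100_000);
printf "identical consecutive timestamps: %d\n", $duplicates;
printf "smallest non-zero gap: %s microseconds\n",
    defined $min_gap ? $min_gap : 'n/a';
```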

YMMV on other OSs and hardware.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: OT How fast a cpu to overwhelm Time::HiRes
by tirwhan (Abbot) on Nov 30, 2005 at 16:22 UTC

    Ah, but one of the OP's questions was whether this would be possible with two distinct processes. And I think it is, even on a single-processor system, because the OS can switch contexts between processes at any time, even in the middle of a system call (well, not at any time, since certain code paths are not preemptible, but nearly enough for this matter).


    Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it. -- Brian W. Kernighan
      because the OS can switch contexts between processes at any time,

      But switching processes is not cheap. Take a good look at the code at the heart of your OS kernel and see what is involved with switching processes.

      All the saving and restoring of registers alone is not insubstantial, but before you get to any of that you have to go through the mechanics of deciding which process is next to run. This involves some sort of prioritised queue mechanism. You also have to update any dynamic priorities (e.g. foreground boost) and check whether the next round-robin process within the current priority arbitration level is eligible to run: is it sleeping, in an IO wait state, etc.

      And once you have chosen the next process to run, you have to check whether it has been swapped out, and potentially shuffle memory to and from disk. Did the process swap invalidate any COW memory that now needs replicating? And almost every process swap is going to cause the processor to stall while the L2 cache is refreshed. Even kernel-level thread swaps involve a substantial amount of housekeeping by the kernel: less than a process swap, but still substantial.

      Process swaps are not instantaneous. Can they be done in less than 2000 cycles/instructions? In C it's vaguely possible, but 18500 for perl?


        But switching processes is not cheap.

        That depends on your OS ;-). Linux tries to make process context switches extremely cheap (and as a result you can get away with using processes instead of threads for parallel performance). A recent mail on LKML states that on a 3GHz P4 the 2.6 kernel can do up to 700,000 process context switches per second. That is only if the processes do nothing except switch; the mail goes on to explain that under normal workloads you'd only get about 10,000 switches per second. I just took a quick look around the machines I have at hand, and found one which reports an average of ~60,000 cs/s over a period of fifteen minutes (via sar -w). Running the lat_ctx benchmark from the lmbench suite on an AMD64 machine gives me a minimum context switch overhead of 0.55 microseconds. Given these figures, it seems conceivable to me that a context switch can take place in significantly under a microsecond on an extremely fast processor with a large cache. I agree that this is definitely not something you'd expect to happen, but it seems possible.
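
        To put those rates side by side, here is a rough conversion from switches per second to average time per switch. The helper name microseconds_per_switch is made up for this example, and the result is only an upper bound on the cost of one switch, since a machine under a normal workload spends most of its time doing real work between switches.

```perl
use strict;
use warnings;

# Hypothetical helper: average wall-clock time per context switch, in
# microseconds, for a given switch rate. This is an upper bound on the
# switch cost, not a measurement of it.
sub microseconds_per_switch {
    my ($switches_per_second) = @_;
    return 1_000_000 / $switches_per_second;
}

# Rates quoted above: the LKML best case, the LKML "normal workload"
# figure, and the sar -w average from one of my machines.
for my $rate (700_000, 10_000, 60_000) {
    printf "%7d cs/s => %.2f microseconds per switch\n",
        $rate, microseconds_per_switch($rate);
}
```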

        Anyway, code walks, as they say. Here's a little script which forks off a number of processes and tries to get the same gettimeofday value in different children:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Time::HiRes qw(gettimeofday usleep);

        my $parent_time  = (gettimeofday)[0] + 5;
        my $children     = 10;
        my $measurements = 5000;
        my $pid;

        for my $child (1..$children) {
            if ($pid = fork()) {
                # Parent: carry on and fork the next child
            }
            elsif (defined $pid) {
                my ($times, @temp_times);
                # Make all children start measuring as nearly simultaneously as we can
                while (1) {
                    last if ((gettimeofday)[0] > $parent_time);
                    usleep 1;
                }
                # Get time measurements, stored as whole microseconds since the epoch
                for (1..$measurements) {
                    @temp_times = gettimeofday();
                    $times .= $temp_times[0] . sprintf("%06d", $temp_times[1]) . "\n";
                    usleep 2;
                }
                sleep 10;
                open(my $record, ">", "timerecord$child") or die "Can't open record file";
                print $record $times;
                close $record or die "Can't close record file";
                exit;
            }
            else {
                die("Cannot fork");
            }
        }

        # Wait for all children to finish (waitpid returns -1 once none are left)
        my $kid;
        do {
            $kid = waitpid(-1, 0);
        } while $kid > 0;

        # Put measurements into a hashtable and end if any duplicates are found
        my %measured;
        for my $child (1..$children) {
            open(my $record, "<", "timerecord$child") or die "Can't open record file";
            while (<$record>) {
                chomp;
                if (exists($measured{$_})) {
                    print "Found duplicate: $_, gettimeofday returned the same value in child $child and $measured{$_}\n";
                    exit;
                }
                $measured{$_} = $child;
            }
            close $record or die "Can't close record file";
        }

        # Check for shortest time passed between two adjacent measurements
        # taken in different children
        my $difference = 42;
        my ($t1, $t2);
        my $last_i = 0;
        my $last_t = 0;
        for my $t (sort keys %measured) {
            if ($last_i != $measured{$t}) {
                my $cur_diff = $t - $last_t;
                if ($cur_diff < $difference) {
                    $difference = $cur_diff;
                    ($t1, $t2) = ($last_t, $t);
                }
            }
            # Always remember the previous measurement, whichever child took it
            $last_t = $t;
            $last_i = $measured{$t};
        }
        print "Found minimum delay of $difference between $t2 - child $measured{$t2} and $t1 - child $measured{$t1}\n";

        On SMP machines this easily finds duplicate measurements so, no surprise there, calls to gettimeofday can return the same value from different processes on SMP. The smallest gap I was able to achieve on a single-processor machine was 11 microseconds. Strangely enough, this was by far not the fastest CPU I tried it on, so I suspect it has something to do with the Linux kernel version (that machine is running the Debian 2.6.8 kernel, whereas all the others have newer versions).

        So, at least from this practical test, it appears you are right: processes don't switch quickly enough for Time::HiRes to return the same result.

