in reply to Re^7: If I am tied to a db and I join a thread, program chrashes
in thread If I am tied to a db and I join a thread, program chrashes
The term $count / ($N * $N * (time - $t)) is how many multiplications (or matrix cells) per second the benchmark can process.
I thought that at first, but then I looked closer.
while( my $r = $qr->dequeue ) { ++$count; if ($count == 63) { $t = time; }
At the very least that inflates the benchmark values a little.
elsif (($count & 63) == 0) { if (time > $t + 5) { printf "%f\n", $count / ($N * $N * (time - $t)); last; } } }
Why would it do that? And the answer is, because it improves the performance of Coro!
As Coro threads are cooperative, if the timing thread called the relatively expensive built-in time each iteration, the CPU time used to process that opcode would directly detract from the time the other threads spent doing multiplications.
That operation effectively reduces the number of expensive calls to time by a factor of 64. As iThreads are preemptive, they don't need or benefit from the reduction in the number of calls to time, making it a performance multiplier for the Coro threads only. Talk about weighting the die.
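To put a number on that factor of 64, here is a minimal sketch. It does not call time at all; it only counts how many loop passes would reach the clock check under a given mask. The helper name clock_checks and the figure of 6400 passes are my own, chosen for illustration, and are not part of the benchmark.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the benchmark): count how many of
# $iterations loop passes would reach the clock check when the check
# is gated by ($count & $mask) == 0.
sub clock_checks {
    my ($iterations, $mask) = @_;
    my $checks = 0;
    for my $count (1 .. $iterations) {
        ++$checks if ($count & $mask) == 0;
    }
    return $checks;
}

my $passes = 6400;    # arbitrary illustrative loop length

# A mask of 0 never skips: the clock would be read on every pass.
printf "checked every pass: %d\n", clock_checks($passes, 0);     # 6400

# The benchmark's mask of 63 only lets every 64th pass through.
printf "gated by & 63:      %d\n", clock_checks($passes, 63);    # 100
```

In a cooperative scheduler every one of those skipped calls is CPU time handed back to the worker threads, which is the point being made above.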
I don't know what the copying is about. It makes sense to unshare data if you do a lot of operations on it because shared data is slower, but one multiplication isn't a lot. Maybe this was included to simulate some optimizations.
Hm. Sorry, but that doesn't make any sense at all.
Multiplying the 50 pairs of values means accessing each shared value once: 100 shared accesses. Copying those values to non-shared memory also means accessing each shared value once: again, 100 shared accesses. But additionally, you have to: a) allocate the non-shared arrays; b) copy the values to that non-shared memory; c) access the non-shared values once each to do the multiply.
Ditto with the results. Instead of just writing them directly to shared memory, he 1) allocates non-shared; 2) writes to non-shared; 3) allocates shared; 4) copies non-shared to shared.
So, to avoid 100 'slow operations', he does: 3*100 allocations; 100 reads from shared (the 'slow operations' he was trying to avoid!); 100 writes to non-shared; 200 reads from non-shared; 100 writes to shared. Not so much of an optimisation.
And note: all of these shenanigans only happen on the iThreads side of the benchmark.
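For anyone who wants the tally in one place, a rough sketch that simply adds up the figures quoted above. It assumes, purely for the sake of the sum, that every operation costs the same, which certainly understates the shared accesses and the allocations. The hash keys are my own labels.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# Operation counts for the copy-first variant, taken from the figures
# in this post. The labels are illustrative only.
my %copy_variant = (
    'allocations'            => 3 * 100,
    'reads from shared'      => 100,
    'writes to non-shared'   => 100,
    'reads from non-shared'  => 200,
    'writes to shared'       => 100,
);

# The direct variant: the 100 shared reads the copying was meant to avoid.
my $direct = 100;

# Assumption: all operations weigh the same, so a plain sum is enough.
my $copy_total = sum(values %copy_variant);

printf "direct: %d, copy variant: %d (%.0fx)\n",
    $direct, $copy_total, $copy_total / $direct;
# prints "direct: 100, copy variant: 800 (8x)"
```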
Cleaning up of the queues is not necessary because (again) this isn't a real solution to a problem but a synthetic benchmark.
But that (again) is a totally Coro-biased view of things.
With 4 (preemptive) threads continually generating matrices--breaking them up into chunks (that will never be processed); sharing them and firing them into a shared queue (injecting sync points into all the threads that have visibility of that queue; ie. all of them)--they are directly competing for the CPUs with the 4 threads that are meant to be doing the work.
Continually adding more and more data to a queue that's never going to be processed means constantly reallocating and copying the underlying shared array as the queue doubles and redoubles in size. It's like timing how long it takes to stick labels on boxes whilst the operator is continuously having to unload and reload the boxes he's already done onto bigger and bigger lorries. The time taken to affix the labels is entirely lost in the noise of everything else he is having to do.
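As a toy model of the unloading and reloading, here is a sketch that counts how many existing elements get moved when a buffer doubles each time it fills. It is not a description of how perl or threads::shared actually grow their arrays; the doubling policy, the starting capacity, and the name copies_for_pushes are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Toy model (an assumption, not perl's real allocator): a buffer that
# doubles in capacity whenever it is full, moving every existing
# element into the new, larger buffer.
sub copies_for_pushes {
    my ($pushes, $capacity) = @_;
    my ($size, $copied) = (0, 0);
    for (1 .. $pushes) {
        if ($size == $capacity) {
            $copied += $size;    # everything already queued is moved
            $capacity *= 2;
        }
        ++$size;
    }
    return $copied;
}

# 1000 pushes starting from a capacity of 1 trigger growth at sizes
# 1, 2, 4, ... 512, so 1 + 2 + 4 + ... + 512 elements are moved.
printf "elements moved: %d\n", copies_for_pushes(1000, 1);    # 1023
```

In the benchmark each of those moves happens on a shared array, under the eyes of every thread that can see the queue, while the workers are supposedly being timed.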
And again, the bias is all in favour of Coro, because with limited size queues the Coro generating threads will have blocked long before he ever starts timing. Ie. 63 * 50 = 3,150 > 512
The only result anyone is interested in is the time it takes.
In that case, I offer:
perl -E"$s=time(); $i=0; ++$c, rand()*rand() while time() < $s+5; say +$c/5"
3210038.8
Let's see Coro compete with that! It's just as (un)fair a comparison as this benchmark.