in reply to Re^7: Using kernel-space threads with Perl (disorder)
in thread Using kernel-space threads with Perl
> checked your work and found several mistakes, at least one of which likely would drastically change your numbers.
Innuendo. You suggest "mistakes", but typically don't identify them so they cannot be countered.
There is one trivial mistake, -l, that has exactly NO effect on the performance.
> My investigation shows ...
Typically, no code for others to try; no indication of your testing methodology; nothing but your words in support of your claims.
If you never post anything but 'authoritative testimony', you cannot be countered. Way to go with maintaining your mystique.
> The non-linear slow-down of Thread::Queue is interesting and might be worth investigating.... I wish I had 4 cores to play with.
And right there in a nutshell is the endemic problem: Guesswork. Of course there is a non-linear slowdown.
But until you've used threads & queues in earnest a few times, the reason probably won't be obvious to you. Namely: lock-contention & context thrash.
Simplistically, the lock contention arises when one end of a given queue attempts to use it while it is currently in use by the other end. So, when one of the worker threads is running on its "own" core, reading from the queue, that queue is almost permanently locked. It only becomes unlocked for a brief period at the end of each loop cycle.
That means that when the file-reading thread comes to write a record to that queue, it will, with high probability, find that the queue is locked, and so enter a wait-state. That wait-state inevitably means that some other thread (or process) will inherit the core it was using, and so a context switch occurs. The reader will not get another timeslice until at least that thread relinquishes the core. And it is entirely possible that it won't get a look-in until several other threads have had their turn. And when it does, it may well find the queue still locked, and the cycle repeats.
Now, it should be obvious to even you how those delays could add up to produce the slow-downs I indicated. And why all current parallelisation research is looking into lock-free data structures and mechanisms.
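The contention pattern described above can be made visible directly. The following is a hypothetical sketch (not the benchmark code below; the worker count, item count, and text are made up for illustration): the producer times each enqueue() call, so the accumulated total shows roughly how long it spends blocked waiting on the queue's internal lock while the workers hold it.

```perl
#! perl -slw
# Hypothetical sketch: measure how long the producer spends blocked
# inside enqueue() while worker threads contend for the queue's lock.
use strict;
use threads;
use Thread::Queue;
use Time::HiRes qw[ time ];

our $T //= 4;                       # worker count (override with -T=n)
my $Q = Thread::Queue->new;

my @workers = map {
    threads->create( sub {
        while( defined( my $item = $Q->dequeue ) ) {
            my $copy = $item;
            $copy =~ s[a][A]g;      # token per-item work
        }
    } );
} 1 .. $T;

my $blocked = 0;
for my $i ( 1 .. 100_000 ) {
    my $t0 = time;
    $Q->enqueue( "line $i: the quick brown fox jumps over the lazy dog\n" );
    $blocked += time - $t0;         # includes any wait for the lock
}
$Q->enqueue( undef ) for 1 .. $T;   # one terminator per worker
$_->join for @workers;

printf "Producer spent %.3f seconds inside enqueue()\n", $blocked;
```

Run with different -T values and watch the producer's time inside enqueue() grow as more workers fight over the single lock.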
To demonstrate, using this trivially modified form of the code above:
```perl
#! perl -sw
use strict;
use threads;
use Thread::Queue;
use Time::HiRes qw[ time ];

sub worker {
    my $Q = shift;
    while( my $line = $Q->dequeue ) {
        $line =~ s[a][A]g;
        print $line;
    }
}

print "Started " . scalar localtime;

our $T //= 4;
my $n = $T - 1;
my @Qs = map Thread::Queue->new, 0 .. $n;
my @threads = map{ threads->create( \&worker, $Qs[ $_ ] ) } 0 .. $n;

my $start = time;
my $i = 0;
while( <> ) {
    $Qs[ $i ]->enqueue( $_ );
    $i = ( $i + 1 ) % $T;
    warn $., ' ', $. / ( time - $start ) unless $. % 1e6;
}
$Qs[ $_ ]->enqueue( undef ) for 0 .. $n;
$_->join for @threads;

print "Ended " . scalar localtime;
```
and a 1 million line file produced for the purpose:
```
C:\test>perl -e"printf qq[line %6d:the quick brown fox jumps over the lazy dog\n], $_ for 1 .. 1e6" >phrases.small

C:\test>dir phrases.small
...
23/03/2011  03:19        57,000,000 phrases.small
...
```
Now the runs using 1, 2, 3, & 4 threads and queues:
```
C:\test>junk71 -T=1 phrases.small >out.txt
Started Wed Mar 23 03:19:58 2011
1000000 19484.8213140505 at C:\test\junk71.pl line 30, <> line 1000000.
Ended Wed Mar 23 03:20:51 2011

C:\test>junk71 -T=2 phrases.small >out.txt
Started Wed Mar 23 03:20:59 2011
1000000 18001.4760827208 at C:\test\junk71.pl line 30, <> line 1000000.
Ended Wed Mar 23 03:21:55 2011

C:\test>junk71 -T=3 phrases.small >out.txt
Started Wed Mar 23 03:27:32 2011
1000000 15906.3434420465 at C:\test\junk71.pl line 30, <> line 1000000.
Ended Wed Mar 23 03:28:35 2011

C:\test>junk71 -T=4 phrases.small >out.txt
Started Wed Mar 23 03:30:32 2011
1000000 15088.645826608 at C:\test\junk71.pl line 30, <> line 1000000.
Ended Wed Mar 23 03:31:38 2011
```
And now for a single-core, sequential run:
```
C:\test>perl -MTime::HiRes=time -E"BEGIN{ warn $t=time }" -pe"s[a][A]g; }{ warn $./(time-$t); warn time" phrases.small >out.txt
1300851145.745 at -e line 1.
977517.104726817 at -e line 2, <> line 1000000.
1300851146.77491 at -e line 2, <> line 1000000.
```
And now a little math.
The best of these is 977517.104726817 / 19484.8213140505 = 50.17: 50 times slower.

The worst of these is 977517.104726817 / 15088.645826608 = 64.78: 64 times slower.
The difference between 64 and my original figures, from which you derived x120, is that these estimates are based on elapsed time, whereas my original figures were based upon actual CPU usage.
To counter your scurrilous "foobar"'d-system possibility, I've had the same tests performed (albeit with a different randomly generated 1-million-line dataset) on dual-core systems running both 32-bit Windows/Perl and Linux, and those results are broadly in line with my results above. My friend & I hope to repeat our provisional tests later this week using exactly the same datasets and a wider range of systems and setups, but based on our results to date, there is nothing to support your claimed "research".
As I have no idea what your "research" consisted of (because, of course, you chose not to tell us), I cannot speculate about what mistakes or mistaken assumptions you may have made whilst doing it; which is probably why you didn't tell us.
But, as there is only a small chance that three entirely different Perl installations--spanning compilers, OSs, hardware and continents--are "foobar"'d in ways that make them exhibit strikingly similar relative performance, I'm confident in stating that your "research" is more than just a little suspect. And your pronouncements based upon it, little more than guesswork and innuendo.
So, the next time you allude to "research" in order to support your mind's eye expertise, try making it something worthy of that epithet.
Replies are listed 'Best First'.

Re^9: (Innuendo and guesswork)
- by tye (Sage) on Mar 23, 2011 at 16:34 UTC
- by BrowserUk (Patriarch) on Mar 23, 2011 at 18:07 UTC
- by tye (Sage) on Mar 23, 2011 at 23:15 UTC
- by BrowserUk (Patriarch) on Mar 24, 2011 at 05:11 UTC
- by ikegami (Patriarch) on Mar 24, 2011 at 18:48 UTC