Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out.
...
Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.)
Once again, I would say that we are in near complete agreement. I wouldn't just hand a beginner the keys to the car either. So, stretching the analogy a little further, think of this place as a driving school!
And if you'll excuse me for self-referencing, my initial response to the OP took exactly that tack:
The answer to both questions is: it depends upon the program. Some programs lend themselves to multi-threading and the conversion can be quite simple. Others, even quite simple programs, can be very hard.
There really is no way to tell you whether it will be easy or hard without at least a fairly detailed high level description of the program. Even then, even if that high level description suggests that the problem could lend itself to threading reasonably easily, whether it does or not, will depend a lot upon how you have coded your program to solve that problem.
That request for further information is my attempt to elicit from the OP what the terrain looks like.
The main difference is that I try to balance the positive and the negative: the possibility that adapting his current solution to threading could be relatively painless, whilst acknowledging that it could also be so hard as to require a significant effort.
My response to your initial post appears to be where you have your main difference with what I've posted. In that post I tried to show that for some implementations of some problems, the conversion from single-threaded to multi-threaded can actually be relatively painless. Indeed, as I went on to try and explain, I believe that starting with a single-threaded implementation is often the best way to tackle the task.
To try and put that speculation into some context: I know just enough about Gene Folding to be dangerous. I know, for example, that it is essentially an NP-hard, cpu-intensive, 3D puzzle. And I also know that these types of problems are often tackled using Genetic Programming techniques(*).
*(Note: There is no direct correspondence between 'Gene' and 'Genetic' in this context)
And naively, GP involves the following basic steps:
- running an identical, cpu-intensive, 'fitness algorithm' on a bunch (generation) of candidate solutions;
- then making a sub-selection of those based on their results;
- then applying some type(s) of mutations to the sub-selection;
- go back to step 1; rinse and repeat.
The two characteristics of this type of algorithm are that:
- The fitness algorithms are independent of each other, and very cpu-intensive.
This is ideal for multiprocessing with each core/cpu utilised acting as a near perfect divisor of the overall elapsed time. The independence of the calculations means that there is very little 'threading' overhead involved.
- The results from the fitness function need to be fed back and coalesced for the mixing (mutation) stage.
This makes threading preferable to forking/multi-processes, as the feedback loop can be done entirely through shared memory (arrays, queues or stacks), avoiding the need for the additional complexities of high fan-out/fan-in IPC.
And a single-threaded GP implementation might generically look something like this:
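sub fitnessCalculation{ ... }
sub mixAndMutate{ ... }

my @data = readData();
my @generation = genRandomGeneration( 1000 );

while( 1 ) {
    ## Score every candidate (assumes score() is an lvalue accessor)
    $_->score = fitnessCalculation( $_ ) for @generation;
    ## Keep the fittest; sort numerically, highest score first
    @generation = ( sort{ $b->score <=> $a->score } @generation )[ 0 .. 200 ];
    last if $generation[ 0 ]->score > $targetScore;
    ## Breed replacements from the survivors
    push @generation, mixAndMutate( @generation );
}
report( $generation[ 0 ] );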
Of course, that is a gross simplification. The terminating condition may incorporate a 'no improvement for N-generations' step. And that may be subject to a local minima/maxima avoidance strategy. And both the fitnessCalculation() and mixAndMutate() subroutines may be very complex, with legions of variations.
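To make the 'no improvement for N-generations' idea a little more concrete, here is a minimal sketch of one way it might be coded. The name makeStagnationCheck() and the sample scores are purely hypothetical, and it assumes that a higher score is better:

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure that reports true once the best
# score has failed to improve for $patience consecutive generations.
sub makeStagnationCheck {
    my( $patience ) = @_;
    my( $best, $stale ) = ( undef, 0 );
    return sub {
        my( $score ) = @_;
        if( !defined $best or $score > $best ) {
            ( $best, $stale ) = ( $score, 0 );   # improvement: reset the counter
        }
        else {
            ++$stale;                            # no improvement this generation
        }
        return $stale >= $patience;
    };
}

# Usage: feed it the best score of each generation (made-up numbers).
my $stalled = makeStagnationCheck( 3 );
my @bestPerGeneration = ( 10, 12, 12, 15, 15, 15, 15, 20 );
my $stoppedAt;
for my $gen ( 0 .. $#bestPerGeneration ) {
    if( $stalled->( $bestPerGeneration[ $gen ] ) ) {
        $stoppedAt = $gen;
        last;
    }
}
print "stopped at generation $stoppedAt\n";
```

In the main loop, a check like this could sit alongside the target-score test. It does nothing about local minima/maxima; that would need a strategy of its own.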
But the fact remains that the cpu-intensive part, fitnessCalculation(), is data-parallel, and so is ripe for multi-tasking.
But the sorting & selection processes, and the mixAndMutate() steps, require that the results of the fitnessCalculation()s be gathered back into the same memory space. And that makes multi-threading preferable to multiprocessing for this task.
So, sticking with the gross simplification, I can sketch out a multi-threading architecture based upon the single-threaded version:
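use threads;
use Thread::Queue;

## Here fitnessCalculation() is the thread body: presumably it loops over
## $Qwork, scores each candidate and queues it to $Qresults, passing on an
## undef to mark the end of its share of a batch.
sub fitnessCalculation{ ... }
sub mixAndMutate{ ... }

my $Qwork    = Thread::Queue->new;
my $Qresults = Thread::Queue->new;

## One worker per core
my @workers = map
    threads->create( \&fitnessCalculation, $Qwork, $Qresults ),
    1 .. $noOfCores;

my @data = readData();
my @generation = genRandomGeneration( 1000 );

while( 1 ) {
    ## Queue the whole generation, plus one undef per worker as an end-of-batch marker
    $Qwork->enqueue( @generation, (undef) x $noOfCores ); ## Updated.
    @generation = ();
    ## Gather scored candidates until every worker has signalled end-of-batch
    for ( 1 .. $noOfCores ) {
        while( defined( my $scored = $Qresults->dequeue ) ) {
            push @generation, $scored;
        }
    }
    ## Keep the fittest; sort numerically, highest score first
    @generation = ( sort{ $b->score <=> $a->score } @generation )[ 0 .. 200 ];
    last if $generation[ 0 ]->score > $targetScore;
    push @generation, mixAndMutate( @generation );
}
report( $generation[ 0 ] );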
Again, that's a gross simplification that omits details (like stopping the threads), but it demonstrates how with relatively minor changes, a single-threaded application might be converted to utilise threads to benefit from parallelism, without introducing much in the way of the scary deep voodoo of threading nightmares, traditionally unavoidable with other forms of threading.
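For what it's worth, here is a small, self-contained sketch of one of the omitted details: stopping the threads. It is not the OP's code, and it differs from the outline above in two ways that I should name plainly. The worker is a separate (hypothetical) worker() sub, and the main thread collects exactly as many results as it queued rather than counting end-of-batch markers. The fitness function, the candidate layout (plain hashes with a 'genes' array) and the core count are all stand-ins. It needs a perl built with ithreads:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $noOfCores = 4;                      # assumed core count
my $Qwork     = Thread::Queue->new;
my $Qresults  = Thread::Queue->new;

# Stand-in fitness function (hypothetical): the sum of the candidate's genes.
sub fitnessCalculation {
    my( $candidate ) = @_;
    my $sum = 0;
    $sum += $_ for @{ $candidate->{genes} };
    return $sum;
}

# Worker: score candidates until an undef arrives, then return,
# which ends the thread.
sub worker {
    while( defined( my $candidate = $Qwork->dequeue ) ) {
        $candidate->{score} = fitnessCalculation( $candidate );
        $Qresults->enqueue( $candidate );
    }
}

my @workers = map { threads->create( \&worker ) } 1 .. $noOfCores;

# One made-up generation of 20 candidates.
my @generation = map { +{ genes => [ $_, $_ + 1, $_ + 2 ] } } 1 .. 20;

# Fan out, then collect exactly as many results as were queued.
$Qwork->enqueue( @generation );
my $expected = @generation;
my @scored   = map { $Qresults->dequeue } 1 .. $expected;

# Shutdown: one undef per worker, then wait for each thread to finish.
$Qwork->enqueue( (undef) x $noOfCores );
$_->join for @workers;

my( $best ) = sort { $b->{score} <=> $a->{score} } @scored;
print "best score: $best->{score}\n";
```

Each worker consumes exactly one undef before returning, so queuing one per worker is enough to stop them all, and the join calls simply wait for that to happen.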
Essentially, each of the major components remains a single linear flow. They may gain a bog standard loop or two, but for the most part, they remain devoid of multi-tasking considerations like locking and synchronisation. It also avoids all the complexities of orchestrating data-flows through asynchronously multiplexed, fan-out and fan-in, IPC.
Whether the OP's existing code would lend itself to such an architecture remains an open question--since the OP hasn't chosen to supply any further details.
But it is that last point that highlights the main difference between our positions. Could it be that the dire warnings in this thread:
I've seen many projects collapse from underestimating the work involved.
I would not advise starting into multi-threading with a task that you really care about.
But don't try it without 10+ years of education and training, at least not on your children.
Might these have put the OP off from even trying, when there is at least the possibility that his code might have lent itself to a relatively easy conversion to the benefits of multi-tasking through threading--e.g. the cutting of his runtimes to 1/2 +constant, 1/4 +constant, or 1/8 +constant, of their single-tasking equivalents?
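To put rough numbers on that "1/N +constant", here is a back-of-envelope sketch. The model (parallel part divided by the number of cores, plus a fixed serial part) and the figures are my own assumptions for illustration, not measurements of anyone's code:

```perl
use strict;
use warnings;

# Hypothetical model: elapsed = parallel part / cores + serial constant.
sub estimatedElapsed {
    my( $parallelSecs, $serialSecs, $cores ) = @_;
    return $parallelSecs / $cores + $serialSecs;
}

# Made-up figures: 960s of fitness calculations, 40s of sort/select/mutate.
printf "%d core(s): %4.0fs\n", $_, estimatedElapsed( 960, 40, $_ )
    for 1, 2, 4, 8;
```

With those made-up figures, the estimate falls from 1000s on one core to 520s, 280s and 160s on 2, 4 and 8 cores; the constant is what stops it halving cleanly each time.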
That's why I get so disheartened when I see such negative responses to requests for information about threading. Especially as those dire warnings usually emanate from those who (at best) have some (more or less) experience of using other threading systems; and at worst come from those who may have read a little literature, or heard a few horror stories--usually about other threading systems--but have no real experience of using iThreads at all.
Finally, the other part of my driving analogy that you missed (or glossed over), is that making a mistake whilst learning to drive can easily turn out to be fatal. Mistakes in programming rarely are.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.