Re^7: Multithreading, how to?

I don't mind. I've found this to be an interesting discussion as well. I appreciate the compliment, although most people would just say I'm stubborn.<grin/> You've already convinced me I need to spend more time with iThreads. I look forward to what else I'll learn.

I am surprised by how badly I'm apparently expressing my position on threads. This is shown by your summary.

I'll summarise it as: "Threading is hard, especially for beginners. It will require you to think differently and use different tools. You may find it easier not to try."

I pretty much agree with everything but the last sentence. I don't think I said you shouldn't try anywhere. I just suggested that it was not a trivial exercise.

Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out. Way back when I learned to drive, we started in an empty parking lot, with very little to run into.

In much the same way, I think it is important to understand that a person's first threading project should be a pilot or test project. Adding threads to an existing project can be akin to putting someone in the drivers seat on the shoulder of a freeway and telling them to drive home.

To take your example a bit further, (I've never been one to stop stretching an analogy until it hurts.<grin/>) I would compare single-thread programming to walking and multi-threaded programming to driving. When driving, you are not focused on staying balanced and watching where you put your feet. You need to be aware of much more:

the boundaries of your vehicle
your speed
the relation of the movement of your feet and hands to the movement of the vehicle
the vehicles around you
the road surface

Once someone becomes proficient at driving, much of this awareness becomes unconscious. (You don't have to think about moving about foot motions when you want to slow down.) To someone new to driving, this is a whole lot to keep up with.

Also like the driving analogy, I would never suggest that someone not learn to drive, just because it is very different from walking. But, I wouldn't think its appropriate to tell someone to just try driving on a 100 mile trip on their first attempt. Driving and threading are both useful skills, that are appropriate when you need them. But, they both require learning, practice and a new set of skills.

Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.)

That comment was what I thought I was supporting in your first response.

G. Wade

Comment on Re^7: Multithreading, how to?

Replies are listed 'Best First'.
Re^8: Multithreading, how to? by BrowserUk (Patriarch) on Jan 01, 2009 at 20:46 UTC
Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out. ... Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.) Once again, I would say that we are in near complete agreement. I wouldn't just hand a beginner the keys to the car either. So, stretching the analogy a little further, think of this place as a driving school! And if you'll excuse me for self-referencing, my initial response to the OP took exactly that tack: The answer to both questions is: it depends upon the program. Some programs lend themselves to multi-threading and the conversion can be quite simple. Others, even quite simple programs, can be very hard. There really is no way to tell you whether it will be easy or hard without at least a fairly detailed high level description of the program. Even then, even if that high level description suggests that the problem could lend itself to threading reasonably easily, whether it does or not, will depend a lot upon how you have coded your program to solve that problem. That request for further information is my attempt to elicit from the OP what the terrain looks like. The main difference is that I try to balance the positive and the negative. The possibility that adapting his current solution to threading could be relatively painless, whilst acknowledging that it could also be so hard as to require a significant effort. My response to your initial post appears to be where you have your main difference with what I've posted. In that post I tried to show that for some implementations of some problems, the conversion from single threaded to multi-threaded can actually be relatively painless. Indeed, as I went on to try and explain, that I believe that starting with a single-threaded implementation is often the best way to tackle the task. To try and put that speculation into some context. I know just enough about Gene Folding to be dangerous. I know for example, that it is essentially an NP-hard, cpu-intensive, 3D puzzle. And I also know that these types of problems are often tackled using Genetic Programming techniques(). (Note: There is no direct correspondence between 'Gene' and 'Genetic' in this context) And naively, GP involves the following basis steps: running an identical, cpu-intensive, 'fitness algorithm' on a bunch (generation) of candidate solutions; then making a sub-selection of those based on their results; then applying some type(s) of mutations to the sub-selection; go back to step 1; rinse and repeat. The two characteristics of this type of algorithm are that: The fitness algorithms are independent of each other, and very cpu-intensive. This is ideal for multiprocessing with each core/cpu utilised acting as a near perfect divisor of the overall elapsed time. The independence of the calculations means that there is very little 'threading' overhead involved. The results from the fitness function need to be fed back and coalesced for the mixing (mutation) stage. This makes threading preferable to forking/multi-processes, as the feedback loop can be done entirely through shared memory (arrays, queues or stacks), avoiding the need for the additional complexities of high fan-out/fan-in IPC. And a single-threaded GP implementation might generically look something like this: `sub fitnessCalculation{ ... } sub mixAndMutate{ ... } my @data = readData(); my @generation = genRandomGeneration( 1000 ); while( 1 ) { $_->score = fitnessCalculation( $_ ) for @generation; @generation = ( sort @generation ) [ 0 .. 200 ]; last if $generation[ 0 ]->score > $targetScore; push @generation, mixAndMutate( @generation ); } report $generation[ 0 ];` [download] Of course, that is s gross simplification. The terminating condition may incorporate a 'no improvement for N-generations' step. And that may be subject to a local minima/maxima avoidance strategy. And both the `fitnessCalculation()` and `mixAndMatch()` subroutines may be very complex with legions of variation. But the fact remains that the cpu-intensive part, the former of those two subroutines, is data-parallel, and so is ripe for multi-tasking. But the sorting & selection processes, and the `mixAndMutate()` steps, require that the results of the `fitnessCalculation()`s be gathered back into the same memory space. And that makes multi-threading preferable to multiprocessing for this task. So, sticking with the gross simplification, I can sketch out a multi-threading architecture based upon the single-threaded version: use threads; use Thread::Queue; sub fitnessCalculation{ ... } sub mixAndMutate{ ... } my $Qwork = new Thread::Queue; my $Qresults = new Thread::Queue; my @workers = map threads->create( \&fitnessCalculation, $Qwork, $Qresults ), 1 .. $noOfCores; my @data = readData(); my @generation = genRandomGeneration( 1000 ); while( 1 ) { $Qwork->enqueue( @generation, (undef) x $noOfCores ); ## Updated. @generation = (); for ( 1 .. $noOfCores ) { push @generation, $_ while $_ = $Qresults->dequeue; } @generation = ( sort @generation ) [ 0 .. 200 ]; last if $generation[ 0 ]->score > $targetScore; push @generation, mixAndMutate( @generation ); } report $generation[ 0 ]; [download] Again, that's a gross simplification that omits details (like stopping the threads), but it demonstrates how with relatively minor changes, a single-threaded application might be converted to utilise threads to benefit from parallelism, without introducing much in the way of the scary deep voodoo of threading nightmares, traditionally unavoidable with other forms of threading. Essentially, each of the major components remains a single linear flow. They may gain a bog standard loop or two, but for the most part, they remain devoid of multi-tasking considerations like locking and synchronisation. It also avoids all the complexities of orchestrating data-flows through asynchronously multiplexed, fan-out and fan-in, IPC. Whether the OPs existing code would lend itself to such an architecture remains an open question--since the OP hasn't chosen to supply any further details. But it is that last point that highlights the main difference between our positions. Could it be that the dire warnings in this thread: I've seen many projects collapse from underestimating the work involved. I would not advise starting into multi-threading with a task that you really care about. But don't try it without 10+ years of education and training, at least not on your children. Might these have put the OP off from even trying? When there is at least the possibility that his code might have lent itself to a relatively easy conversion to the benefits of multi-tasking through threading--eg. the cutting of his runtimes to 1/2 +constant, 1/4 +constant, or 1/8 +constant, of their single-tasking equivalents. That's why I get so disheartened when I see such negative responses to requests for information about threading. Especially as those dire warning usually emanate from those who (at best), have some (more or less) experience of using other threading systems; and at worst come from those who may have read a little literature, or heard a few horror stories--usually about other threading systems--but have no real experience of using iThreads at all. Finally, the other part of my driving analogy that you missed (or glossed over), is that making a mistake whilst learning to drive can easily turn out to be fatal. Mistakes in programming rarely are. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply] [d/l] [select]
Re^9: Multithreading, how to? by gwadej (Chaplain) on Jan 01, 2009 at 21:19 UTC
I see that we are in agreement. My approach to the answer has more to do with my history than anything said in this thread. A fair number of painful projects have been the result of the following three step process: Customer or Boss: Can we solve problem X with technology Y? Me: Probably, if the problem is partitioned correctly and if Y works with our infrastructure, ... Customer or Boss: I told the CEO we could have technology Y in place next week because you said it is easy. Since some people stop listening at yes or probably or whatever, I have gotten really paranoid and toss in the risks up front. To be fair, I was responsible for the underestimating work comment, but not the other two. I intended the comment more to set expectations than to completely discourage. As a result of my experience, I tend to focus more on the risks than the benefits. It's been my experience that programmers, in general, are extremely optimistic about what they are capable of doing. (Hubris, maybe?<grin/>) So, I don't worry about that part. Finally, the other part of my driving analogy that you missed (or glossed over), is that making a mistake whilst learning to drive can easily turn out to be fatal. Mistakes in programming rarely are. They may not be fatal to the programmer, but programming mistakes can definitely impact your company, or your paycheck in a negative way. I don't want to sound too dire about it, but at least your mistake wasn't fatal is no help if you are fired for a project that failed. I'm not saying that using threads would get someone fired. I just feel that it's important to take off the rose-colored glasses before making (potentially) big changes to something you are working on. If the problem can be help by threads and the programmer is aware of places where she should watch her step, I am all for threads. As for the comment about inexperience with iThreads, I'm guilty as charged. However, my experience with several threading systems in different systems and languages, with multi-processing in multiple OSes, and home-built cooperative multi-tasking systems, leads me to believe that it is easy to underestimate the difference in approach. I would never suggest that I know more about this topic than you do, but caution is sometimes warranted. G. Wade	[reply]
Re^9: Multithreading, how to? by Tommy (Chaplain) on Jan 03, 2009 at 21:04 UTC
Can you explain this part of your code? ... `$Qwork->( @generation, (undef) x $noOfCores );` [download] Is a method name supposed to have followed the "->" ?? -- Tommy	[reply] [d/l]
Re^10: Multithreading, how to? by BrowserUk (Patriarch) on Jan 03, 2009 at 21:51 UTC
shmem's right. I omitted the method `enqueue`, now corrected. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."	[reply] [d/l] [select]
Re^10: Multithreading, how to? by shmem (Chancellor) on Jan 03, 2009 at 21:32 UTC
No. The notation `$variable->(@args)` means de-referencing, i.e. invocation of a subroutine reference (or "anonymous subroutine"). Method invocation for `$object` is generally written as `$object->method(@args)` (unless the content of the scalar `$object` is a subroutine reference, which is rarely used). So I guess, BrowserUk really meant `$Qwork->enqueue( @generation, (undef) x $noOfCores );` [download] or such, but that's just guessing.	[reply] [d/l] [select]