I don't mind. I've found this to be an interesting discussion as well. I appreciate the compliment, although most people would just say I'm stubborn.<grin/> You've already convinced me I need to spend more time with iThreads. I look forward to what else I'll learn.
I am surprised by how badly I'm apparently expressing my position on threads. This is shown by your summary.
I'll summarise it as: "Threading is hard, especially for beginners. It will require you to think differently and use different tools. You may find it easier not to try."
I pretty much agree with everything but the last sentence. I don't think I said anywhere that you shouldn't try. I just suggested that it was not a trivial exercise.
Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out. Way back when I learned to drive, we started in an empty parking lot, with very little to run into.
In much the same way, I think it is important to understand that a person's first threading project should be a pilot or test project. Adding threads to an existing project can be akin to putting someone in the driver's seat on the shoulder of a freeway and telling them to drive home.
To take your example a bit further (I've never been one to stop stretching an analogy until it hurts.<grin/>), I would compare single-threaded programming to walking and multi-threaded programming to driving. When driving, you are not focused on staying balanced and watching where you put your feet. You need to be aware of much more:
- the boundaries of your vehicle
- your speed
- the relation of the movement of your feet and hands to the movement of the vehicle
- the vehicles around you
- the road surface
Once someone becomes proficient at driving, much of this awareness becomes unconscious. (You don't have to think about foot motions when you want to slow down.) To someone new to driving, this is a whole lot to keep up with.
Also like the driving analogy, I would never suggest that someone not learn to drive, just because it is very different from walking. But, I wouldn't think it's appropriate to tell someone to just try driving on a 100-mile trip on their first attempt. Driving and threading are both useful skills that are appropriate when you need them. But, they both require learning, practice and a new set of skills.
Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.)
That comment was what I thought I was supporting in your first response.
| [reply] |
Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out.
...
Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.)
Once again, I would say that we are in near complete agreement. I wouldn't just hand a beginner the keys to the car either. So, stretching the analogy a little further, think of this place as a driving school!
And if you'll excuse me for self-referencing, my initial response to the OP took exactly that tack:
The answer to both questions is: it depends upon the program. Some programs lend themselves to multi-threading and the conversion can be quite simple. Others, even quite simple programs, can be very hard.
There really is no way to tell you whether it will be easy or hard without at least a fairly detailed high level description of the program. Even then, even if that high level description suggests that the problem could lend itself to threading reasonably easily, whether it does or not, will depend a lot upon how you have coded your program to solve that problem.
That request for further information is my attempt to elicit from the OP what the terrain looks like.
The main difference is that I try to balance the positive and the negative: the possibility that adapting his current solution to threading could be relatively painless, whilst acknowledging that it could also be so hard as to require a significant effort.
My response to your initial post appears to be where you have your main difference with what I've posted. In that post I tried to show that for some implementations of some problems, the conversion from single-threaded to multi-threaded can actually be relatively painless. Indeed, as I went on to try and explain, I believe that starting with a single-threaded implementation is often the best way to tackle the task.
To try and put that speculation into some context: I know just enough about Gene Folding to be dangerous. I know, for example, that it is essentially an NP-hard, cpu-intensive, 3D puzzle. And I also know that these types of problems are often tackled using Genetic Programming techniques(*).
*(Note: There is no direct correspondence between 'Gene' and 'Genetic' in this context)
And naively, GP involves the following basic steps:
- running an identical, cpu-intensive, 'fitness algorithm' on a bunch (generation) of candidate solutions;
- then making a sub-selection of those based on their results;
- then applying some type(s) of mutations to the sub-selection;
- go back to step 1; rinse and repeat.
The two characteristics of this type of algorithm are that:
- The fitness algorithms are independent of each other, and very cpu-intensive.
This is ideal for multiprocessing with each core/cpu utilised acting as a near perfect divisor of the overall elapsed time. The independence of the calculations means that there is very little 'threading' overhead involved.
- The results from the fitness function need to be fed back and coalesced for the mixing (mutation) stage.
This makes threading preferable to forking/multi-processes, as the feedback loop can be done entirely through shared memory (arrays, queues or stacks), avoiding the need for the additional complexities of high fan-out/fan-in IPC.
And a single-threaded GP implementation might generically look something like this:
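sub fitnessCalculation{ ... }   ## cpu-intensive: scores one candidate
sub mixAndMutate{ ... }         ## breeds new candidates from the survivors

my @data = readData();
my @generation = genRandomGeneration( 1000 );

while( 1 ) {
    ## Score every candidate in the current generation.
    $_->score = fitnessCalculation( $_ ) for @generation;
    ## Keep the fittest, best (highest) score first.
    @generation = ( sort { $b->score <=> $a->score } @generation )[ 0 .. 200 ];
    last if $generation[ 0 ]->score > $targetScore;
    ## Refill the generation from the survivors.
    push @generation, mixAndMutate( @generation );
}
report( $generation[ 0 ] );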
Of course, that is a gross simplification. The terminating condition may incorporate a 'no improvement for N-generations' step. And that may be subject to a local minima/maxima avoidance strategy. And both the fitnessCalculation() and mixAndMutate() subroutines may be very complex with legions of variation.
But the fact remains that the cpu-intensive part, the former of those two subroutines, is data-parallel, and so is ripe for multi-tasking.
But the sorting & selection processes, and the mixAndMutate() steps, require that the results of the fitnessCalculation()s be gathered back into the same memory space. And that makes multi-threading preferable to multiprocessing for this task.
So, sticking with the gross simplification, I can sketch out a multi-threading architecture based upon the single-threaded version:
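use threads;
use Thread::Queue;

sub fitnessCalculation{ ... }   ## Thread body here: takes ( $Qwork, $Qresults )
sub mixAndMutate{ ... }

my $Qwork    = Thread::Queue->new;
my $Qresults = Thread::Queue->new;

## One worker per core, all fed from the same pair of queues.
my @workers = map {
    threads->create( \&fitnessCalculation, $Qwork, $Qresults )
} 1 .. $noOfCores;

my @data = readData();
my @generation = genRandomGeneration( 1000 );

while( 1 ) {
    ## Queue the generation, plus one undef per worker as an end marker.
    $Qwork->enqueue( @generation, (undef) x $noOfCores ); ## Updated.
    @generation = ();
    ## Gather scored candidates; this assumes each worker passes its
    ## undef marker through to $Qresults when it has no more work.
    for my $worker ( 1 .. $noOfCores ) {
        while( defined( my $scored = $Qresults->dequeue ) ) {
            push @generation, $scored;
        }
    }
    ## Keep the fittest, best (highest) score first.
    @generation = ( sort { $b->score <=> $a->score } @generation )[ 0 .. 200 ];
    last if $generation[ 0 ]->score > $targetScore;
    push @generation, mixAndMutate( @generation );
}
report( $generation[ 0 ] );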
Again, that's a gross simplification that omits details (like stopping the threads), but it demonstrates how with relatively minor changes, a single-threaded application might be converted to utilise threads to benefit from parallelism, without introducing much in the way of the scary deep voodoo of threading nightmares, traditionally unavoidable with other forms of threading.
Essentially, each of the major components remains a single linear flow. They may gain a bog standard loop or two, but for the most part, they remain devoid of multi-tasking considerations like locking and synchronisation. It also avoids all the complexities of orchestrating data-flows through asynchronously multiplexed, fan-out and fan-in, IPC.
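For the details glossed over above (the worker body and stopping the threads), here is one minimal way they might look. Treat it as a sketch under assumptions, not the definitive way to do it: scoreCandidate() is a hypothetical stand-in for the real fitness calculation, the candidates are plain numbers, $noOfCores is hard-coded, and instead of the undef markers used above it simply counts results back in. It also relies on Thread::Queue's end() method, which needs a reasonably recent version of that module.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $noOfCores = 4;    ## Assumption: set this to suit your machine.

## Hypothetical stand-in for the real cpu-intensive fitness calculation.
sub scoreCandidate { my( $candidate ) = @_; return $candidate ** 2 }

## Worker body: pull candidates until the work queue is ended and drained,
## posting a [ candidate, score ] pair back for each one.
sub worker {
    my( $Qwork, $Qresults ) = @_;
    while( defined( my $candidate = $Qwork->dequeue ) ) {
        $Qresults->enqueue( [ $candidate, scoreCandidate( $candidate ) ] );
    }
    return;
}

my $Qwork    = Thread::Queue->new;
my $Qresults = Thread::Queue->new;
my @workers  = map {
    threads->create( \&worker, $Qwork, $Qresults )
} 1 .. $noOfCores;

## One generation: hand out the work, then collect exactly as many results.
my @generation = ( 1 .. 10 );
$Qwork->enqueue( @generation );
my @scored = map { $Qresults->dequeue } 1 .. scalar @generation;

## Shutdown: once the queue is ended and empty, dequeue returns undef,
## so every worker falls out of its loop and can be joined.
$Qwork->end;
$_->join for @workers;

## Best (highest) score first, as in the selection step above.
my( $best ) = sort { $b->[ 1 ] <=> $a->[ 1 ] } @scored;
print "best: $best->[ 0 ] scored $best->[ 1 ]\n";
```

Counting results rather than markers means the workers can stay alive across generations without any risk of one worker swallowing two markers; the price is that each result has to carry its candidate with it, because results come back in whatever order the workers finish.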
Whether the OP's existing code would lend itself to such an architecture remains an open question--since the OP hasn't chosen to supply any further details.
But it is that last point that highlights the main difference between our positions. Could it be that the dire warnings in this thread:
I've seen many projects collapse from underestimating the work involved.
I would not advise starting into multi-threading with a task that you really care about.
But don't try it without 10+ years of education and training, at least not on your children.
Might these have put the OP off from even trying?
When there is at least the possibility that his code might have lent itself to a relatively easy conversion to the benefits of multi-tasking through threading--e.g. the cutting of his runtimes to 1/2 +constant, 1/4 +constant, or 1/8 +constant, of their single-tasking equivalents.
That's why I get so disheartened when I see such negative responses to requests for information about threading. Especially as those dire warnings usually emanate from those who (at best), have some (more or less) experience of using other threading systems; and at worst come from those who may have read a little literature, or heard a few horror stories--usually about other threading systems--but have no real experience of using iThreads at all.
Finally, the other part of my driving analogy that you missed (or glossed over), is that making a mistake whilst learning to drive can easily turn out to be fatal. Mistakes in programming rarely are.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
| [reply] [d/l] [select] |
I see that we are in agreement. My approach to the answer has more to do with my history than anything said in this thread. A fair number of painful projects have been the result of the following three-step process:
- Customer or Boss: Can we solve problem X with technology Y?
- Me: Probably, if the problem is partitioned correctly and if Y works with our infrastructure, ...
- Customer or Boss: I told the CEO we could have technology Y in place next week because you said it is easy.
Since some people stop listening at "yes" or "probably" or whatever, I have gotten really paranoid and toss in the risks up front.
To be fair, I was responsible for the comment about underestimating the work, but not the other two. I intended the comment more to set expectations than to completely discourage.
As a result of my experience, I tend to focus more on the risks than the benefits. It's been my experience that programmers, in general, are extremely optimistic about what they are capable of doing. (Hubris, maybe?<grin/>) So, I don't worry about that part.
Finally, the other part of my driving analogy that you missed (or glossed over), is that making a mistake whilst learning to drive can easily turn out to be fatal. Mistakes in programming rarely are.
They may not be fatal to the programmer, but programming mistakes can definitely impact your company, or your paycheck, in a negative way. I don't want to sound too dire about it, but "at least your mistake wasn't fatal" is no help if you are fired for a project that failed.
I'm not saying that using threads would get someone fired. I just feel that it's important to take off the rose-colored glasses before making (potentially) big changes to something you are working on.
If the problem can be helped by threads and the programmer is aware of places where she should watch her step, I am all for threads.
As for the comment about inexperience with iThreads, I'm guilty as charged. However, my experience with several threading systems in different systems and languages, with multi-processing in multiple OSes, and home-built cooperative multi-tasking systems, leads me to believe that it is easy to underestimate the difference in approach.
I would never suggest that I know more about this topic than you do, but caution is sometimes warranted.
| [reply] |
Can you explain this part of your code? ...
$Qwork->( @generation, (undef) x $noOfCores );
Is a method name supposed to have followed the "->" ??
| [reply] [d/l] |
I can think of several problems with your driving analogy but I think they are rooted in two different things -- we have instincts related to movement/motion and we get real-time sensory feedback while driving.
Meanwhile, for programming, beyond syntax highlighting there is no real-time sensory feedback. It's also not clear to me what instincts -- if any -- we have that help with programming. Puzzle solving comes to mind but that is more like a trait than an instinct.
Cooking may be a better analogy, but even that involves some real-time sensory feedback.
This has been an interesting thread though. Thanks!
Elda Taluta; Sarks Sark; Ark Arks
| [reply] |
I can think of several problems with your driving analogy but I think they are rooted in two different things -- we have instincts related to movement/motion and we get real-time sensory feedback while driving.
No analogy is perfect. However, I think that if we squint our eyes a little and maybe stick our heads through a convenient opening in that proverbial cardboard carton--even if we're not prepared to climb all the way out--that there are some parallels even for those two things:
- Feedback.
It may not be real-time per se, but I think warning and error messages are at least somewhat analogous to the feedback driving provides.
That kind of ethereal tingling sensation one gets at the base of the spine, when the surface you are driving on has little grip. When you've been there and felt it a few times, you get to recognise it long before the tyres actually lose grip. And if you're good enough--race and rally drivers for example--then you can use it not just to tell you to slow down, but to temper the way you supply inputs to the controls. If you're really good, you can even anticipate the slide and apply corrections before they are indicated.
The same thing goes for warnings and errors. It's not just the text of those messages that tells you something, but the way they are issued--their number and frequency. If they are many and continuous, the problem is probably inside the loop. If they are fewer and staccato in their occurrence, you probably have a bounds error--maybe the infamous out-by-one on the loop control.
There are other signs. The cpu fan steps up in frequency when you're not expecting it to. The drive light goes hard on. Even the notes and tune the drive motor plays as the program runs can tell you something is different. These things don't usually tell you what is wrong, but they tell you that something is wrong--and maybe even indicate where.
Feedback is also the primary reason I love my REPL, simplistic as it is. There is almost always a copy of it running on my system, and whenever I have any doubts about a line of code I'm about to write, I switch to it and try that line out in the REPL before coding it in the program.
And when trying to debug an algorithm that produces large volumes of output, I'll often direct the output to the screen and shrink the font to a minimum--on my console that's an unreadable 5-point Lucida--and then just watch the output scroll for a while. Instead of concentrating upon the detail of the output, I look for patterns in the flow. I cannot tell you how many times I've spotted, if not the cause, the area of the code to look at for the bugs.
These aren't things that you can read in any text book, nor learn from any lecture. They're intuition you acquire over time from the feedback the code gives you.
- Instincts
A few years ago--shortly before coming to this site for the first time as it happens--I spent a couple of years in which I did virtually no coding. Indeed, for the first of those two I did none at all. And the main thing I noticed when I started to get back into it, was not that I had forgotten how to code, or write algorithms, or design my programs.
It was that my instincts had become dulled.
Everything just took longer, and was harder. From deciding what to name my subroutines, to ordering their arguments. How to structure my data; split my program across files; split my statements across lines. It wasn't that things took an inordinate amount of time--just longer. I had to think about it rather than just doing it.
And when the program failed, it was even worse. I would often sit staring at the code without any intuition about where to start looking. Worst of all, I'd find myself "shotgun debugging". Scattering debug throughout the program looking for clues. Altering statements to see what, if any, effect the changes had. Trying the same things multiple times in the hope, if not the expectation, that the outcome would be different.
I'd say that it probably took 2 years or so to get my intuition back to where it had previously been. Maybe that was slowed down somewhat by moving to a new language--Perl--at the same time, but still it was an uncomfortable transition back.
I think that intuition plays a considerable role in lifting a mediocre programmer into the realms of a reasonably good one. More importantly, perhaps the hardest working and most conscientious programmer I ever knew simply never seemed--in the time I worked with him--to develop any level of intuition for coding at all.
For him, coding was simply a 9 to 5 activity predicated upon the need to put money in the bank. As I said, he was neither lazy nor callous about it, but it was just a job that he did to the best of his ability--without seeming to take any great pleasure in it--until he took the step into management at the first opportunity. Something it turned out he excelled at.
So, whilst the analogy may be imperfect, requiring a stretching of the mind to make the pieces fit, I don't think that it is necessarily a stretch too far.
| [reply] |