With code, once the right steps are codified and verified, the computer will happily perform those steps over and over precisely without getting bored, tired or distracted. The same does not hold true for human beings.

However, even in the single threaded case, code can appear to work for a long time (potentially years) and eventually fail because it was not coded correctly. While the code may not get bored or distracted, changes in its environment can make it fail intermittently. This is, of course, independent of concurrency considerations.

If you share nothing, there can be--barring errors in the core implementation; which do crop up occasionally in this field just as they do in every other field--no accidental interactions.

Unless I misunderstand, though, the separate threads still share the same file system, sockets, environment, etc., many of which can cause interactions if the programmer is not careful. If I'm understanding the issue correctly, iThreads make memory non-shared by default. While the threads are more isolated this way, they are not completely independent.
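
To make the "non-shared by default" point concrete, here is a minimal sketch. It assumes a perl built with ithreads support; the variable names are made up for illustration. It only covers memory, and says nothing about the file system, sockets or environment, which remain process-wide.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 0;            # ordinary variable: each thread gets its own copy
my $shared :shared = 0;     # explicitly marked as shared between threads

my $thr = threads->create( sub {
    $private++;                       # changes only this thread's copy
    { lock( $shared ); $shared++; }   # change is visible to every thread
    return $private;
} );

my $seen_in_child = $thr->join;       # wait for the thread, fetch its return value

print "child saw private = $seen_in_child\n";
print "parent sees private = $private, shared = $shared\n";
```

The child's increment of $private never reaches the parent; only the variable explicitly marked :shared carries the change across.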

The second major flaw in the multiple persons analogy is that with few exceptions, the success of human group activities is dependent upon the management, coordination and timing of the interactions between the individuals. Hence the 'teacher' & 'conductor' roles in those analogies.

I obviously missed something in my explanation. I was not seeing the conductor or teacher as portions of the threaded code; the programmer is in the position of the 'teacher' or 'conductor' in these analogies.

Let me try another way.

In most single threaded programs, the programmer can lay out a sequence of steps with appropriate control logic. You can pretty much see how the code will run from beginning to end. (More complicated problems make this harder to see, but bear with me for a moment.)

In a multi-threaded program, on the other hand, the programmer builds smaller, independent sequences of steps that run under their own control. Any place that these threads need to interact is not under the direct control of the programmer. So the programmer is kind of one step removed from the code. This makes understanding the interactions and timing (for lack of a better word) more important in this style than in single threaded code.

It's been my experience that many people find this harder. Or they code assuming that things will behave the way they want (as they did in the single threaded model). These people are often stumped when their code doesn't work.

Obviously, the fewer the interactions between the threads (not accessing shared resources and such), the less the coordination issues will affect us. If there's no interaction between the threads, then I wonder why they are in the same program.

The most important of those lessons is that the easiest way to avoid the well-known problems of deadlocking, priority inversion et al. is to simply avoid using the mechanisms that lead to them. It sounds too simple to be true, but in the vast majority of the use cases I have explored, I've found relatively simple and reliable solutions that use iThreads, Queues, minimal shared data and minimal locking.

These are the lessons I also learned from multi-threading work dating back to the mid 90s, even without iThreads. But these approaches do not always come naturally when someone tries threads for the first time.

I guess that is what I've been trying to say (but have only done so badly). Multi-threading and single threading benefit from somewhat different viewpoints and tool sets. Someone doing threads for the first time should be aware that the different tools and viewpoints may be needed.

G. Wade

Re^6: Multithreading, how to?
by BrowserUk (Patriarch) on Jan 01, 2009 at 06:21 UTC

    I hope you don't object to me continuing this discussion? I find debate the best way for me to clarify my own views and opinions--and you debate well.

    That said, I'm going to be lazy here and summarise(*) your post above.

    (* I mean 'precisé', but I'm unsure of the correct spelling. I know that last grapheme is incorrect, but I don't know how to produce (what I think) is the correct one.)

    I'll summarise it as: "Threading is hard, especially for beginners. It will require you to think differently and use different tools. You may find it easier not to try."

    That is an unfair simplification of the position you have expressed in the preceding posts. But not by much.

    Programmers have an inherent bias that leads them to believe and express that what they do is hard. It makes sense. It is good for business, and good for salaries. But ostensibly, relative to other technical fields, it is entirely wrong. Or at least, overstated.

    Learning to drive is a radical departure from anything you are likely to have done before you do it. It doesn't just require hand-eye coordination, but rather hand-eye-foot-brain coordination. But more importantly, it requires anticipation of the actions and reactions of others. Whilst some other activities--sports; computer games etc--exhibit similar requirements, the difference is that mistakes when driving are not just sometimes lethal, but frequently so.

    By contrast, programming is essentially benign. Barring the extremes--nuclear power plants; weapon systems; medical equipment--programming errors are seldom lethal. And even in those extreme cases where they can be, they can in most cases be mitigated by proper and conscientious testing.

    The point I'm trying to make is that everyone has to start somewhere. And regardless of the complexities of the task undertaken by a programmer, there is always a learning curve involved. But unlike many other fields of endeavour, programmers have the inherent benefit of being able to test the effects of their skills, prior to putting them into potentially lethal practice.

    Sure, threads are different. Threading can be hard. But neither fact should be lauded as a reason to preclude anyone from trying to acquire the requisite skills.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I don't mind. I've found this to be an interesting discussion as well. I appreciate the compliment, although most people would just say I'm stubborn.<grin/> You've already convinced me I need to spend more time with iThreads. I look forward to what else I'll learn.

      I am surprised by how badly I'm apparently expressing my position on threads. This is shown by your summary.

      I'll summarise it as: "Threading is hard, especially for beginners. It will require you to think differently and use different tools. You may find it easier not to try."

      I pretty much agree with everything but the last sentence. I don't think I said anywhere that you shouldn't try. I just suggested that it was not a trivial exercise.

      Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out. Way back when I learned to drive, we started in an empty parking lot, with very little to run into.

      In much the same way, I think it is important to understand that a person's first threading project should be a pilot or test project. Adding threads to an existing project can be akin to putting someone in the driver's seat on the shoulder of a freeway and telling them to drive home.

      To take your example a bit further, (I've never been one to stop stretching an analogy until it hurts.<grin/>) I would compare single-thread programming to walking and multi-threaded programming to driving. When driving, you are not focused on staying balanced and watching where you put your feet. You need to be aware of much more:

      • the boundaries of your vehicle
      • your speed
      • the relation of the movement of your feet and hands to the movement of the vehicle
      • the vehicles around you
      • the road surface

      Once someone becomes proficient at driving, much of this awareness becomes unconscious. (You don't have to think about foot motions when you want to slow down.) To someone new to driving, this is a whole lot to keep up with.

      Also, as in the driving analogy, I would never suggest that someone not learn to drive just because it is very different from walking. But I wouldn't think it's appropriate to tell someone to just try driving on a 100 mile trip on their first attempt. Driving and threading are both useful skills that are appropriate when you need them. But they both require learning, practice and a new set of skills.

      Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.)

      That comment was what I thought I was supporting in your first response.

      G. Wade
        Your driving description actually sets up a perfect example of what I'm trying to say. When someone first wants to drive, we do not hand them keys and point them to the freeway for them to figure it out.

        ...

        Without knowing the problem the OP is trying to solve, I couldn't recommend threads as the right solution. Any more than I would recommend driving without knowing where someone needs to go (and what the terrain looks like.)

        Once again, I would say that we are in near complete agreement. I wouldn't just hand a beginner the keys to the car either. So, stretching the analogy a little further, think of this place as a driving school!

        And if you'll excuse me for self-referencing, my initial response to the OP took exactly that tack:

        The answer to both questions is: it depends upon the program. Some programs lend themselves to multi-threading and the conversion can be quite simple. Others, even quite simple programs, can be very hard.

        There really is no way to tell you whether it will be easy or hard without at least a fairly detailed high level description of the program. Even then, even if that high level description suggests that the problem could lend itself to threading reasonably easily, whether it does or not, will depend a lot upon how you have coded your program to solve that problem.

        That request for further information is my attempt to elicit from the OP what the terrain looks like.

        The main difference is that I try to balance the positive and the negative. The possibility that adapting his current solution to threading could be relatively painless, whilst acknowledging that it could also be so hard as to require a significant effort.

        My response to your initial post appears to be where you have your main difference with what I've posted. In that post I tried to show that for some implementations of some problems, the conversion from single threaded to multi-threaded can actually be relatively painless. Indeed, as I went on to try and explain, I believe that starting with a single-threaded implementation is often the best way to tackle the task.

        To try and put that speculation into some context: I know just enough about Gene Folding to be dangerous. I know, for example, that it is essentially an NP-hard, cpu-intensive, 3D puzzle. And I also know that these types of problems are often tackled using Genetic Programming techniques(*).

        *(Note: There is no direct correspondence between 'Gene' and 'Genetic' in this context)

        And naively, GP involves the following basic steps:

        1. running an identical, cpu-intensive, 'fitness algorithm' on a bunch (generation) of candidate solutions;
        2. then making a sub-selection of those based on their results;
        3. then applying some type(s) of mutations to the sub-selection;
        4. go back to step 1; rinse and repeat.

        The two characteristics of this type of algorithm are that:

        1. The fitness algorithms are independent of each other, and very cpu-intensive.

          This is ideal for multiprocessing with each core/cpu utilised acting as a near perfect divisor of the overall elapsed time. The independence of the calculations means that there is very little 'threading' overhead involved.

        2. The results from the fitness function need to be fed back and coalesced for the mixing (mutation) stage.

          This makes threading preferable to forking/multi-processes, as the feedback loop can be done entirely through shared memory (arrays, queues or stacks), avoiding the need for the additional complexities of high fan-out/fan-in IPC.

        And a single-threaded GP implementation might generically look something like this:

        sub fitnessCalculation{ ... }
        sub mixAndMutate{ ... }

        my @data = readData();
        my @generation = genRandomGeneration( 1000 );

        while( 1 ) {
            $_->score = fitnessCalculation( $_ ) for @generation;
            @generation = ( sort @generation ) [ 0 .. 200 ];
            last if $generation[ 0 ]->score > $targetScore;
            push @generation, mixAndMutate( @generation );
        }

        report $generation[ 0 ];

        Of course, that is a gross simplification. The terminating condition may incorporate a 'no improvement for N-generations' step. And that may be subject to a local minima/maxima avoidance strategy. And both the fitnessCalculation() and mixAndMutate() subroutines may be very complex with legions of variation.

        But the fact remains that the cpu-intensive part, the former of those two subroutines, is data-parallel, and so is ripe for multi-tasking.

        But the sorting & selection processes, and the mixAndMutate() steps, require that the results of the fitnessCalculation()s be gathered back into the same memory space. And that makes multi-threading preferable to multiprocessing for this task.

        So, sticking with the gross simplification, I can sketch out a multi-threading architecture based upon the single-threaded version:

        use threads;
        use Thread::Queue;

        sub fitnessCalculation{ ... }
        sub mixAndMutate{ ... }

        my $Qwork    = new Thread::Queue;
        my $Qresults = new Thread::Queue;

        my @workers = map threads->create(
            \&fitnessCalculation, $Qwork, $Qresults
        ), 1 .. $noOfCores;

        my @data = readData();
        my @generation = genRandomGeneration( 1000 );

        while( 1 ) {
            $Qwork->enqueue( @generation, (undef) x $noOfCores ); ## Updated.
            @generation = ();
            for ( 1 .. $noOfCores ) {
                push @generation, $_ while $_ = $Qresults->dequeue;
            }
            @generation = ( sort @generation ) [ 0 .. 200 ];
            last if $generation[ 0 ]->score > $targetScore;
            push @generation, mixAndMutate( @generation );
        }

        report $generation[ 0 ];

        Again, that's a gross simplification that omits details (like stopping the threads), but it demonstrates how with relatively minor changes, a single-threaded application might be converted to utilise threads to benefit from parallelism, without introducing much in the way of the scary deep voodoo of threading nightmares, traditionally unavoidable with other forms of threading.
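
        For anyone who wants something to run, here is a toy version of that architecture. It is a sketch under stated assumptions, not the sketch above made complete: the "problem" (evolving an integer towards a made-up $target), the worker() wrapper, the four-worker count and the 200-round cap are all invented for illustration. It also swaps one technique: the main thread counts one result per candidate rather than waiting for end-of-generation undef markers, so it does not depend on how those markers get distributed among the workers. undef is used only to stop the workers at the end. It assumes a perl built with ithreads and a Thread::Queue recent enough to pass array references.

        ```perl
        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        my $noOfCores = 4;       # assumed worker count
        my $target    = 4242;    # toy goal: the integer we try to evolve

        # Toy fitness function: 0 is a perfect score, more negative is worse.
        sub fitnessCalculation {
            my( $candidate ) = @_;
            return -abs( $candidate - $target );
        }

        # Worker loop: score candidates until an undef arrives, then exit.
        sub worker {
            my( $Qwork, $Qresults ) = @_;
            while( defined( my $candidate = $Qwork->dequeue ) ) {
                $Qresults->enqueue( [ $candidate, fitnessCalculation( $candidate ) ] );
            }
        }

        # Return four randomly nudged copies (+/- 100) of each survivor.
        sub mixAndMutate {
            return map { my $c = $_; map { $c + int( rand 201 ) - 100 } 1 .. 4 } @_;
        }

        my $Qwork    = Thread::Queue->new;
        my $Qresults = Thread::Queue->new;
        my @workers  = map { threads->create( \&worker, $Qwork, $Qresults ) } 1 .. $noOfCores;

        my @generation = map { int rand 10_000 } 1 .. 1000;
        my $best;

        for my $round ( 1 .. 200 ) {            # hard cap, so the toy always stops
            $Qwork->enqueue( @generation );

            # Exactly one result comes back per candidate, so count them.
            my @scored = map { $Qresults->dequeue } 1 .. @generation;

            # Keep the 200 best; each entry is [ candidate, score ].
            @scored = ( sort { $b->[1] <=> $a->[1] } @scored )[ 0 .. 199 ];
            $best   = $scored[ 0 ];
            last if $best->[1] >= 0;            # perfect score reached

            @generation = map { $_->[0] } @scored;
            push @generation, mixAndMutate( @generation );
        }

        # Stopping the threads: one undef per worker, then wait for each to finish.
        $Qwork->enqueue( ( undef ) x $noOfCores );
        $_->join for @workers;

        print "best candidate: $best->[0] (score $best->[1])\n";
        ```

        Note that the only thread-aware code is the worker() wrapper and the handful of queue calls; the fitness and mutation routines know nothing about threads, and there is no explicit locking anywhere.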

        Essentially, each of the major components remains a single linear flow. They may gain a bog standard loop or two, but for the most part, they remain devoid of multi-tasking considerations like locking and synchronisation. It also avoids all the complexities of orchestrating data-flows through asynchronously multiplexed, fan-out and fan-in, IPC.

        Whether the OP's existing code would lend itself to such an architecture remains an open question--since the OP hasn't chosen to supply any further details.

        But it is that last point that highlights the main difference between our positions. Consider the dire warnings in this thread:

        • I've seen many projects collapse from underestimating the work involved.
        • I would not advise starting into multi-threading with a task that you really care about.
        • But don't try it without 10+ years of education and training, at least not on your children.

        Might these have put the OP off from even trying?

        That would be a pity, given that there is at least the possibility that his code might have lent itself to a relatively easy conversion to the benefits of multi-tasking through threading--e.g. the cutting of his runtimes to 1/2 + constant, 1/4 + constant, or 1/8 + constant of their single-tasking equivalents.
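
        To put rough numbers on "1/N + constant", here is a back-of-envelope sketch. The model (parallel part divided evenly across cores, plus a fixed serial remainder) and the figures of 3600 seconds and 60 seconds are assumptions chosen purely for illustration, not measurements of anything.

        ```perl
        use strict;
        use warnings;

        # Assumed model: elapsed = parallelisable time / cores + constant remainder.
        sub elapsed {
            my( $parallelSecs, $constantSecs, $cores ) = @_;
            return $parallelSecs / $cores + $constantSecs;
        }

        # Assumed figures: 3600s of fitness calculation, 60s of everything else.
        printf "%d core(s): %6.0f s\n", $_, elapsed( 3600, 60, $_ ) for 1, 2, 4, 8;
        ```

        The constant is why the gain never quite reaches a clean half or quarter: the more cores there are, the more the serial remainder dominates.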

        That's why I get so disheartened when I see such negative responses to requests for information about threading. Especially as those dire warnings usually emanate from those who (at best), have some (more or less) experience of using other threading systems; and at worst come from those who may have read a little literature, or heard a few horror stories--usually about other threading systems--but have no real experience of using iThreads at all.

        Finally, the other part of my driving analogy that you missed (or glossed over), is that making a mistake whilst learning to drive can easily turn out to be fatal. Mistakes in programming rarely are.



      I mean 'precisé', but I'm unsure of the correct spelling

      First person future tense would be "préciserai", but does it make much sense to use French verb conjugations in English? You can "write a précis", however.
      I can think of several problems with your driving analogy but I think they are rooted in two different things -- we have instincts related to movement/motion and we get real-time sensory feedback while driving.

      Meanwhile, for programming, beyond syntax highlighting there is no real-time sensory feedback. It's also not clear to me what instincts -- if any -- we have that help with programming. Puzzle solving comes to mind but that is more like a trait than an instinct.

      Cooking may be a better analogy, but even it suffers from some real-time sensory feedback.

      This has been an interesting thread though. Thanks!

      Elda Taluta; Sarks Sark; Ark Arks

        I can think of several problems with your driving analogy but I think they are rooted in two different things -- we have instincts related to movement/motion and we get real-time sensory feedback while driving.

        No analogy is perfect. However, I think that if we squint our eyes a little and maybe stick our heads through a convenient opening in that proverbial cardboard carton--even if we're not prepared to climb all the way out--that there are some parallels even for those two things:

        1. Feedback.

          It may not be real-time per se, but I think warning and error messages are at least somewhat analogous to the feedback driving provides.

          That kind of ethereal tingling sensation one gets at the base of the spine, when the surface you are driving on has little grip. When you've been there and felt it a few times, you get to recognise it long before the tyres actually lose grip. And if you're good enough--race and rally drivers for example--then you can use it not just to tell you to slow down, but to temper the way you supply inputs to the controls. If you're really good, you can even anticipate the slide and apply corrections before they are indicated.

          The same thing goes for warnings and errors. It's not just the text of those messages that tells you something, but the way they are issued--their number and frequency. If they are many and continuous, the problem is probably inside the loop. If they are fewer and staccato in their occurrence, you probably have a bounds error--maybe the infamous out-by-one on the loop control.

          There are other signs. The cpu fan steps up in frequency when you're not expecting it to. The drive light goes hard on. Even the notes and tune the drive motor plays as the program runs can tell you something is different. These things don't usually tell you what is wrong, but they tell you that something is wrong--and maybe even indicate where.

          Feedback is also the primary reason I love my REPL, simplistic as it is. There is almost always a copy of it running on my system, and whenever I have any doubts about a line of code I'm about to write, I switch to it and try that line out in the REPL before coding it in the program.

          And when trying to debug an algorithm that produces large volumes of output, I'll often direct the output to the screen and shrink the font to a minimum--on my console that's an unreadable 5-point Lucida--and then just watch the output scroll for a while. Instead of concentrating upon the detail of the output, I look for patterns in the flow. I cannot tell you how many times I've spotted, if not the cause, the area of the code to look at for the bugs.

          These aren't things that you can read in any text book, nor learn from any lecture. They're intuition you acquire over time from the feedback the code gives you.

        2. Instincts

          A few years ago--shortly before coming to this site for the first time as it happens--I spent a couple of years in which I did virtually no coding. Indeed, for the first of those two I did none at all. And the main thing I noticed when I started to get back into it was not that I had forgotten how to code, or write algorithms, or design my programs.

          It was that my instincts had become dulled.

          Everything just took longer, and was harder. From deciding what to name my subroutines, to ordering their arguments. How to structure my data; split my program across files; split my statements across lines. It wasn't that things took an inordinate amount of time--just longer. I had to think about it rather than just doing it.

          And when the program failed, it was even worse. I would often sit staring at the code without any intuition about where to start looking. Worst of all, I'd find myself "shotgun debugging". Scattering debug throughout the program looking for clues. Altering statements to see what, if any, effect the changes had. Trying the same things multiple times in the hope, if not the expectation, that the outcome would be different.

          I'd say that it probably took 2 years or so to get my intuition back to where it had previously been. Maybe that was slowed down somewhat by moving to a new language--Perl--at the same time, but still it was an uncomfortable transition back.

          I think that intuition plays a considerable role in lifting a mediocre programmer into the realms of a reasonably good one. More importantly, perhaps the hardest working and most conscientious programmer I ever knew simply never seemed--in the time I worked with him--to develop any level of intuition for coding at all.

          For him, coding was simply a 9 to 5 activity predicated upon the need to put money in the bank. As I said, he was neither lazy nor callous about it, but it was just a job that he did to the best of his ability--without seeming to take any great pleasure in it--until he took the step into management at the first opportunity. Something it turned out he excelled at.

        So, whilst the analogy may be imperfect, requiring a stretching of the mind to make the pieces fit, I don't think that it is necessarily a stretch too far.

