http://qs1969.pair.com?node_id=1094201

This started life as a reply to Re^2: Which 'Perl6'? (And where?), but it seems too important to bury down there in a long dead thread as a reply to an author I promised to resist, and who probably will not respond. So I'm putting it here to see what, if any, interest it arouses.


  1. Is concurrency appropriate? There are two basic motivations ... and 2) to speed things up. In the latter case, if the problem being tackled is really IO bound, turning to concurrency probably won't help.

    That is way too simplistic a view. If the problem is IO bound to a single, local, harddisk, and is uncacheable, then concurrency may not help.

    But change any of the four defining elements of those criteria; and it might -- even: probably will -- be helped by well written asynchronicity. Eg.

    1. If the IO data is, or can be, spread across multiple local physical drives; concurrency can speed overall throughput by overlapping requests to different units.
    2. If the disks are remote -- as in SAN, NAS, cloud etc. -- then again, overlapping requests can increase throughput by utilising buffering and waiting time for processing.
    3. If the drives aren't harddisks, but SSDs; or SSD buffered HDs; or PCI connected virtual drives; then overlapping several fast read requests with each slower write request can more fully utilise the available bandwidth and improve throughput.
    4. If the IO involved displays temporal locality of reference -- that is, if the nature of the processing is such that a subset of the data has multiple references over a short period of time, even if that subset changes over the longer term -- then suspending the IO for new references until re-references to existing cached data play out comes about naturally if fine-grained concurrency is used.
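To put a rough number on the overlap argument in the list above, here is a minimal sketch in Python (not Perl; chosen only for illustration). `fetch_block` is a hypothetical stand-in that sleeps to mimic the latency of a remote or independent device; the function names and the 50ms figure are assumptions, not measurements from any real system.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_block(block_id, latency=0.05):
    """Hypothetical stand-in for a read from a remote or independent device."""
    time.sleep(latency)  # the calling thread waits here, as it would on real IO
    return block_id * 2


def read_sequential(block_ids):
    """Issue one request at a time; the latencies add up."""
    return [fetch_block(b) for b in block_ids]


def read_overlapped(block_ids, workers=8):
    """Issue up to `workers` requests at once; the latencies overlap."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch_block, block_ids))  # map keeps input order


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    ids = list(range(8))
    _, t_seq = timed(read_sequential, ids)
    _, t_ovl = timed(read_overlapped, ids)
    print(f"sequential: {t_seq:.2f}s  overlapped: {t_ovl:.2f}s")
```

The work is still entirely "IO bound" in the sense that almost no CPU is used; the gain comes purely from not waiting for one request to finish before issuing the next.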

    And if some or all of the IO in your IO bound processing is to the network, or network attached devices; or the intranet; or the internet; or the cloud; -- eg. webserving; webcrawling; webscraping; collaborative datasets; email; SMS; customer facing; ....... -- then both:

    • Preventing IO from freezing your processing;
    • And allowing threads of execution whose IO has completed to continue as soon as a core is available -- ie. not also have to wait for any particular core to become available;

    are mandatory for effective utilisation of modern hardware and networks; even for IO-bound processing.

    Only kernel(OS) threading provides the required combination of facilities. Cooperative multitasking (aka. 'green threads'; aka. Win95 tech) simply does not scale beyond the single core/single thread hardware of the last century.

  2. The Problem with Threads.

    The problem with "The Problem with Threads", is that it is just so much academic hot air divorced from the realities of the real world.

    Only mathematicians and computer scientists demand total determinacy; and throw their arms up in refusal to work if they don't get it.

    The rest of the world -- you, me, mothers and toddlers, doctors, lawyers, spacemen, dustmen, pilots, builders, shippers, movers & shakers, factory workers, engineers, tinkers, tailors, soldiers, sailors, rich & poor men, beggars and thieves; all have to live in the real -- asynchronous -- world, where shit happens.

    Deliveries are late; machines break down; people are sick; power-outs and system-downs occur; the inconvenient realities of life have to be accepted, lived with and dealt with.

    The problem is not that threading is hard; the problem is that people keep on saying that "threading is hard"; and then stopping there.

    Man is very adept at dealing with hard and complex tasks

    Imagine all the places you'd never have been; all the things you'd never have done; if the once wide-spread belief that we would suffocate if we attempted to travel at over 30mph had prevailed.

    Too trivial an example for you? Ok. Think about heart transplantation. Think about the problems of disconnecting and reconnecting the (fragile, living) large bore pipes supplying and removing the pumped liquid; the wires carrying electrical control signals; the small bore pipes carrying the lubricants needed to keep the pump alive and removing the waste. Now think about the complexities of doing a pump change whilst keeping the engine running; the passengers comfortable and the 'life force' intact. And all the while contending with all the other problems of compatibility; rejection; infection; compounded diagnosis.

    Circa 5000 heart transplants occurred last year. Mankind is good at doing difficult things.

    Asynchronicity and non-determinism are 'solved problems' in almost every other walk of life

    From multiple checkouts in supermarkets; to holding patterns in the skies above airport hubs; to off & on ramps on motorways; to holding tanks in petro-chemical plants; to waiting areas in airports and doctors and dentists surgeries; to carousels in baggage claims and production lines; distribution warehouses in supply chains; roundabouts and filter-in-turn; {Add the first 10 things that spring to your mind here! }.

    One day in the near future a non-indoctrinated mathematician is going to invent a symbol for an asynchronous queue.

    She'll give it a nice, technical sounding name like "Temporally Lax Composer", which will quickly become lost behind the cute acronym, and a new era of deterministic, asynchronous composability will ensue.

    And the academic world will rejoice, proclaim her a genius of our time, and no doubt award her a Nobel prize. (That'd be nice!)

    And suddenly the mathematicians will realise that a process or system of processes can be deterministic, without the requirement for every stage of the process (equation) to occur in temporal lockstep.

    'Safety' is the laudable imperative of the modern era.

    As in code-safety and thread-safety, but also every other kind of predictable, potentially preventable danger.

    Like piety, chastity & sobriety from bygone eras, it is hard to argue against; but the world is full (and getting fuller) of sexually promiscuous atheists who enjoy a drink; that hold down jobs, raise kids and perform charitable works. The world didn't fall apart with the wane of the religious, moral and sobriety campaigns of the past.

    In an ideal world, all corners would be rounded; flat surfaces 'soft-touch'; voltages would be low; gases non-toxic; hot water wouldn't scald; radiant elements wouldn't sear; microwaves would be confined to lead-lined bunkers; there'd be no naked flames; and every home would be fire-proof, flood-proof, hurricane-proof, tornado-proof, earthquake-proof, tsunami-proof and pestilence-proof.

    Meanwhile in the real-world, walk around your own house and see all the dangers that lurk for the unsupervised, uneducated, unwary, careless or stupid and ask yourself why do they persist? Practicality and economics.

    Theoreticians love theoretical problems; and eschew practical solutions.

    When considering concurrency, mathematicians love to invent ever so slightly more (theoretically) efficient solutions to the 'classical' problems.

    Eg. The Dining Philosophers. In a nutshell: how can 6 fil..Phillo.. guys eat their dinners using 5 forks without one or more of them starving? They'll combine locks and syncs, barriers and signals, mutexes and spinlocks and semaphores trying to claw back some tiny percentage of a quasilinear factor.

    Why? Buy another bloody fork; or use a spoon; or eat with your damn fingers.

    The problem is said to represent the situation where you have 6 computers that need to concurrently use the scarce resource of 5 tape machines. But that's dumb!

    It's not a resource problem but a capital expenditure problem. Buy another damn tape machine and save yourself 10 times its cost by avoiding having to code and maintain a complex solution. Better still, buy two extra tape machines; cos as sure as eggs is eggs, it'll be the year-end accounting run; or the Black Friday consumer spending peak when one of those tape machines defies the 3 sigma MTBF and breaks.

    Threading can be complex, but there are solutions to all of the problems all around us in the everyday, unreliable, non-deterministic operations of everyday modern life.

    And the simplest solution to many of them is to avoid creating problems in the first place. Don't synchronise (unless you absolutely have to). Don't lock (unless it is absolutely unavoidable). Don't share (unless avoiding doing so creates greater problems).

    But equally, don't throw the baby out with the bath water. Flames are dangerous; but oh so very useful.

  3. Futures et al are the future. There are much simpler, safer, higher level ways to do concurrency. I haven't tried Paul Evans' Futures, but they look the part.

    And therein lies the very crux of the problem. Most of those decrying threads; and those offering alternatives to them; either haven't tried them -- because they read they were hard -- or did try them on the wrong problems, and/or using the wrong techniques; and without taking the time to become familiar with and understand their requirements and limitations.

    Futures neither remove the complexity nor solve the problems; they just bury them under the covers forcing everyone to rely upon the efficacy of their implementation and the competence of the implementors.

    And the people making the decisions are taking advice from those thread-shy novices with silver bullets and employing those with proven track records of being completely useless at implementing threaded solutions.

    The blind taking advice from the dumb and employing the incompetent.

  4. Perl 5 "threads" are very heavy. This sometimes introduces additional complexity.

    The "heaviness" of P5 threading is a misnomer. The threads aren't heavy; the implementation of shared memory is heavy. And that could easily be fixed. If there was any interest. If there wasn't an institutionalised prejudicial barrier preventing anyone even suggesting change to improve the threading support; much less supporting those with the knowledge and ideas to take them forward.

    They've basically stagnated for the past 8 or more years because p5p won't allow change.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
Re: The problem with "The Problem with Threads"
by Corion (Patriarch) on Jul 18, 2014 at 11:58 UTC

    Thanks for writing your thoughts about threads. I use threads, and I like them a bit more than most of the other ways to introduce parallelism and concurrency. I only have one quibble with your points:

    The "heaviness" of P5 threading is a misnomer. The threads aren't heavy; the implementation of shared memory is heavy. And that could easily be fixed.

    As the current implementation of threads tries to simulate parallelism within one interpreter, "fixing" that is not easy as long as you want the promise of implicitly shared things (like the namespace and tied variables most importantly) to remain there. If you move to a less implicit model of sharing, like your usual approach using a queue, and also remove the promise of changes being visible to every thread, or convert it to a threat/warning that changes become visible everywhere, without protection, then I can concur that threads could become less heavy. But as they are now, they are heavy and making the Perl environment look to several threads as if they were the only thread currently running makes them remain heavy.

      As the current implementation of threads tries to simulate parallelism within one interpreter,

      After 24 hrs of thinking about that statement, I cannot make it gel with my knowledge that each thread starts a new interpreter.

      Given it is you, I assume I'm just misunderstanding your drift?



        No, it was me who is simply wrong. Actually testing reveals that each thread really has its own namespace, like you say. So I'm less sure of where the actual problem or overhead lies with sharing data across threads, at least where the namespace is involved.

Re: The problem with "The Problem with Threads"
by zentara (Archbishop) on Jul 18, 2014 at 12:57 UTC
    Use the Fork, Luke, the Fork. :-)

    Oh wait, everything on Windows OS is a thread, even the forks. If I had to write a serious program, that could involve crashes or whatever, I would use fork. On Linux, SysV Shared Memory along with a fork, is the best way to go. It just seems to have a cumbersome interface now; it would be nice to have a super-easy SysV shared memory module. I know modules exist, but they are not as easy to use as threads::shared.


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
      Use the Fork, Luke, the Fork. :-)

      Try using a fork -- the metal utensil -- to slice a tomato and see the sticky mess you end up with.

      Like its namesake, fork is useful for some things and not so for others.

      everything on Windows OS is a thread, even the forks.
      Everything on modern *nix is also a thread. A forked process is simply a process with a single thread.

        What are you advocating then, in regards to all this? Are you lobbying for forks and threads that don't use clone(), so you get a pure executable space, unencumbered by the parent process's ENV and filehandles? I think that would be the most desirable way to do it, but you would be forced to deal with the filehandles yourself, having to open them all manually. Most people don't expect to have to manually open stdout or stdin.

Re: The problem with "The Problem with Threads"
by raiph (Deacon) on Jul 31, 2014 at 17:49 UTC
    If the problem is IO bound to a single, local, harddisk, and is uncacheable, then concurrency may not help.

    Yes. I appreciated your detailed response.

    (Fwiw, in my original comment I meant that a problem that is as you describe is really IO bound, and a problem that isn't, isn't.)

    Only mathematicians and computer scientists demand total determinacy; and throw their arms up in refusal to work if they don't get it.

    I think you've misinterpreted The Problem with Threads. The author was not demanding total determinacy (which I agree would be profoundly bogus). To quote from the abstract: "Nondeterminism should be explicitly and judiciously introduced where needed, rather than removed where not needed."

    don't throw the baby out with the bath water. Flames are dangerous; but oh so very useful.

    While your particular choice of metaphors has me imagining a complicated maneuver involving strong folk, an old fashioned bath, a gassy baby, a lighter, and some sort of worthwhile singeing operation, I don't think your point is that far adrift from (Perl) "should not hide the existence of OS-level threads, or fail to provide access to lower level concurrency control constructs".

    Futures neither remove the complexity nor solve the problems; they just bury them under the covers forcing everyone to rely upon the efficacy of their implementation and the competence of the implementors.

    I confess to being amazed by this statement. Are you really saying that, when solving any concurrent problem correctly (data-wise), Futures are never (or seldom) simpler for the programmers writing, reading, and modifying the code than directly using the underlying low level concurrency constructs (which I'll hand wavily define as threads, locks, cas, etc.)?

    Do you think there are any very useful high level concurrency constructs?

    The "heaviness" of P5 threading is a misnomer. The threads aren't heavy; the implementation of shared memory is heavy. And that could easily be fixed. ... They've basically stagnated for the past 8 or more years because p5p won't allow change.

    I'd appreciate a link to a decent recent technical discussion of the problem of the heaviness of shared memory in P5 and reasonable potential fixes. Failing that, a couple paragraphs that summarize the problem and the fix you're suggesting would be great. Or maybe you could write a meditation that invites monks to focus on the technical issues related to the problem and potential fixes?

      • (Fwiw, in my original comment I meant that a problem that is as you describe is really IO bound

        (I had some trouble parsing that sentence; I hope I got you right.)

        The problems I describe where concurrency does help IO bound problems, are "really IO bound".

        1. With multiple physical drives:

          A simple filter -- read modify write (eg. grep) -- can be reading the next record from the source whilst the previous write to the destination completes. Runtime can be more than halved.

        2. Remote drives:

          By overlapping the latencies in the reads and writes; throughput can be increased substantially.

        3. SSDs:

          The write time for SSDs is usually substantially greater than the read time. For tasks that write fewer records than are read -- eg. line filters etc. -- overlapping the two can increase throughput.

        4. Temporal locality:

          Ditto.

        The point is that your statement implies that all IO bound processes cannot benefit from threading; where the truth is that only a few types cannot realise any benefit.

      • I think you've misinterpreted The Problem with Threads. "Nondeterminism should be explicitly and judiciously introduced where needed, rather than removed where not needed."

        Actually, no I haven't. And your quote proves it.

        The single biggest source of both bugs & design fails, and unrealised performance, is synchronisation and locking. Ie. Attempts to impose determinism.

        People insist on trying to enforce the determinism that when this thread has finished with this piece of data; that thread starts processing it. They'll use semaphores, or locks, or signals, or mutexes, and often some combination of them. And with them come all the nasties wrongly attributed to "threading"; dead-locks, live-locks, and priority inversions; often transient, and always a nightmare to diagnose and fix.

        And even when they get it right, all they've done is create a lock-stepped, and thus sequential process. Almost, if not exactly, as if no threading had been done.

        Except of course, they've carefully orchestrated that there be at least one, and often multiple, context swaps occurring between steps. And then they wonder why it's slowed everything down. By attempting to deterministically control the flow of data between threads, you're working against the system scheduler and guaranteeing that no thread ever uses its full timeslice. Indeed, many times you're actually forcing the scheduler to load a thread only to immediately remove it again because the other thread that you're trying to pass control to hasn't yet run.

        The archetypal example of this is the Global Interpreter Lock as seen in Python, Ruby et al.

        Think of this like a hub airport would be if there was no buffer space -- aprons and gates. Planes would have to circle until one of the connecting flights arrived, then both land, exchange passengers; then both take off and circle again until another connecting flight was available. A stupid, but surprisingly accurate analogy.

        And people forget -- and that includes academics who tend to test and benchmark their algorithms in isolation of other workloads -- that the scheduler isn't just scheduling the threads of their process, but also the threads of every other process in the system. If you fail to utilise the full timeslice, the remainder doesn't automatically get transferred to another thread of your process, but most times gets lost entirely because (the thread of) another process, and often many other processes, get scheduled before your process gets another chance. And even when it (your process) does get another slot, it may be the wrong thread, or even the same thread that just abandoned its last one.

        To make best use of threading, you need to embrace non-determinism, by eschewing locking and synchronisation and the conditional checks associated with them -- unless they are absolutely required, which with appropriately considered algorithms, is very rare -- and allow things to free run.

        By providing low-overhead, flexible sized buffers (queues) between sequential sections of processing, you allow the well-tried and proven system scheduler with its multiple classes, dynamic priorities, dynamic timeslices et al. to give timeslices to threads that need them, in a timely manner.
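A minimal sketch of that shape, in Python purely for illustration (the thread is about Perl, and nothing here is taken from any Perl implementation). Each sequential section runs in its own thread, and bounded queues are the only connection between them. The names `stage`, `run_pipeline` and `_DONE` are hypothetical. Note that `queue.Queue` does its own internal locking; the point is only that the application code contains no locks, signals or handshakes of its own. Note also that under CPython's GIL, CPU-bound stages will not run in parallel, so this shows the design rather than a speed-up.

```python
import queue
import threading

_DONE = object()  # sentinel passed down the pipeline to mark end of stream


def stage(work, inbox, outbox):
    """Apply `work` to each item from inbox and pass the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # tell the next stage we are finished
            return
        outbox.put(work(item))


def run_pipeline(items, steps, buffer_size=32):
    """Run each step in its own thread, joined by bounded queues."""
    queues = [queue.Queue(maxsize=buffer_size) for _ in range(len(steps) + 1)]

    def feed():
        for item in items:
            queues[0].put(item)  # blocks only when the first buffer is full
        queues[0].put(_DONE)

    threads = [threading.Thread(target=feed)]
    threads += [
        threading.Thread(target=stage, args=(step, queues[i], queues[i + 1]))
        for i, step in enumerate(steps)
    ]
    for t in threads:
        t.start()

    results = []
    while True:  # the calling thread drains the final queue
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

For example, `run_pipeline(range(5), [lambda x: x + 1, lambda x: x * 2])` pushes each number through both steps. With one thread per stage and FIFO queues the output order matches the input order, without any stage ever waiting on a particular peer to "take its turn".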

        Imagine the complexity of locking required by a shared hash with multiple reading and writing threads performing concurrent access. And imagine trying to test it; and then debug it.

        Then, if you are really interested, go off and watch Cliff Click describe his lock-free hash table. Watch it all. See how he scales a hash table to be used concurrently from thousands of threads. See also the discussion of the bottlenecks with the standard Java ConcurrentHashMap.

        Bottom line: Everything in that paper (The Problem with Threads) tackles the "problem" from the completely wrong direction. Had it not been dated 2006, I would have assumed that it had been written circa 1995 or earlier, such is the level of dated thinking it employs.

      • I don't think your point is that far adrift from (Perl) "should not hide the existence of OS-level threads, or fail to provide access to lower level concurrency control constructs".

        The sentence you quote -- "But equally, don't throw the baby out with the bath water. Flames are dangerous; but oh so very useful." -- is (in section 2 of my reply) aimed directly at the paper you cite.

        However, your quote from the P6 spec. is also questionable if the resulting low-level constructs end up being anything like Perl5's.

        Simply exposing the low-level system constructs, especially if they are forced into being POSIX compatible emulations over the underlying OS is arguably worse. You'd just be recreating all the bad bits of the Perl5 implementation.

        And the addition of Futures as the only alternative will do little to make things better. See below.

      • Are you really saying that, when solving any concurrent problem correctly (data-wise), Futures are never (or seldom) simpler for the programmers writing, reading, and modifying the code than directly using the underlying low level concurrency constructs ?

        No. I like Futures a lot. They are the best (cleanest; easiest to grasp), high level threading construct I've yet seen. For a certain class of problems, a well implemented version -- and that means NOT the obvious, naive implementation -- can/could be a very simple and effective solution.

        But I am saying that they move the problems somewhere else.

        In effect, all (naive) Futures do is wrap over (hide) the mechanics of ThreadCreate() and ThreadJoin(). And all their problems still persist.

        When a Future comes to be realised -- the thread comes to be joined -- if that thread is not yet finished, it will block. And as soon as you have blocking, you have the potential for deadlocking, live-locking and priority inversion. Except now the locking is hidden from, and inaccessible to the programmer.
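One well-known way this bites can be sketched with Python's `concurrent.futures` (used here only as a convenient illustration; it says nothing about how Perl 6 or Paul Evans' Future module behave). A task running on a pool submits a second task to the same pool and then waits on its Future. With a single worker, the second task can never start while the first is waiting for it. The `timeout` is there only so the demonstration terminates; without it, the one-worker case would hang forever. `outer_task` and `demo` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def outer_task(pool, wait_seconds=0.2):
    """Submit a second task to the same pool, then block on its Future."""
    inner = pool.submit(lambda: "inner done")
    try:
        # Realising the Future blocks this worker until the inner task finishes.
        return inner.result(timeout=wait_seconds)
    except TimeoutError:
        return "blocked: inner task never got a worker"


def demo(workers):
    """Run outer_task on a pool with the given number of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return pool.submit(outer_task, pool).result()


if __name__ == "__main__":
    print(demo(1))  # the only worker is busy waiting for the inner task
    print(demo(2))  # a second worker is free to run the inner task
```

Nothing in `outer_task` looks like locking, yet the one-worker case is a deadlock in everything but name, and the programmer has no handle on the wait other than the timeout.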

        And what's worse, the very simplicity of Futures is -- as happened with the synchronized keyword in Java -- going to encourage people to throw threads at everything without understanding or thinking through the consequences. The very act of providing a simple, high-level construct that hides much of the mechanics of threading and locking, will encourage them to be used widely and inappropriately.

        It is possible to implement the hidden lock such that it will detect deadlocks and produce diagnostics, but the runtime penalties of every implementation I have seen are such that for any performance critical application -- why else are you using threading? -- the penalties add up to the point where programmers will need to revert to the low-level constructs in order to regain the level of control they need to realise the performance they are seeking.

        And finally, there are a whole raft of classes of algorithm for which (naive) Futures are not just inappropriate, but completely unusable. For some of these classes, a sophisticated implementation can mitigate that and render them usable; but that is still the subject of research.

        The bottom line is that to implement Futures such that they are reliable, easy to use and sophisticated enough to allow their use for a wide range of problems, is extremely hard. And testing them even harder. If P6 shipped -- should that mythical event ever occur -- with a naive implementation, or a poorly tested sophisticated one, the damage to its reputation would be huge.

        Now think how long (and how many re-writes) it has taken for Java threading to approach some level of real usability.

      I'd appreciate a link to a decent recent technical discussion of the problem of the heaviness of shared memory in P5 and reasonable potential fixes. Failing that, a couple paragraphs that summarize the problem and the fix you're suggesting would be great. Or maybe you could write a meditation that invites monks to focus on the technical issues related to the problem and potential fixes?

      I started to put something together in reply to Corion's post above, but RL took priority. Give me a week or two and I'll post my demonstration of the problems, highlight (some of) the source(s) of those problems; and offer some suggestions as to how they could be solved (for perl5).


Re: The problem with "The Problem with Threads"
by BrowserUk (Patriarch) on Aug 12, 2014 at 22:29 UTC
    And the academic world will rejoice, proclaim her a genius of our time, and no doubt award her a Nobel prize. (That'd be nice!)

    Wish it, and it shall be so :)


        In neither link was I able to find that a "mathematician [invented] a symbol for an asynchronous queue". Can anybody help point to that part?
Re: The problem with "The Problem with Threads"
by sundialsvc4 (Abbot) on Jul 18, 2014 at 15:55 UTC

    Good thoughts.   Very.

    The most common mis-use of threads that I see is the “flaming arrows strategy.”   For each request, light another flaming arrow (a thread), and shoot it into the air to land where and how it may and then die.   This is sure to overwhelm almost any scheduler, which also has no real way to distinguish one scheduler-unit-of-work from another.   It also burdens the system with setup and teardown of processes, which is often expensive.

    A far better (and, far more scalable) approach is to do what’s done in, say, any restaurant:   to have a pool of workers who shift their attentions among a greater number of active orders, which are flowing station-by-station through a flexible and well-defined lifespan.   The workers can be generalists, or specialists, or sometimes both, and the allocation can be adjusted at any time.   The system that is built using threads/processes, is very aware of exactly what business-problem it is constructed to solve.   When a unit of work is obliged to wait, a worker is not.   Instead, the order is briefly “parked.”   Workers live to a ripe old age.   If a single work-unit needs to go to several stations at once (burger, fries, milkshake), it might be opportunistically serviced by three workers simultaneously.   Commitments regarding service level are engineered into the system, so that you can say (and measure), that “95% of the time, all orders will be prepared and served to the customer within 3 minutes.”   Concurrency is used to address the problem, but there is not a one-to-one correspondence between workers and work.   There is a dedicated management role, separate from any worker, which is constantly “riding the faders” to keep everything in balance.
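The restaurant picture can be sketched roughly as follows, in Python for illustration only. A fixed number of long-lived workers pull from a shared queue of orders, so the number of threads is independent of the number of requests. `serve_orders` and `prepare` are hypothetical names, and the sketch leaves out the management role, service levels and multi-station orders described above.

```python
import queue
import threading


def serve_orders(orders, prepare, workers=3):
    """Process every order with a fixed pool of worker threads."""
    pending = queue.Queue()
    for order in orders:
        pending.put(order)
    finished = queue.Queue()  # results come back through a queue as well

    def worker():
        while True:
            try:
                order = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do; the worker retires
            finished.put((order, prepare(order)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    results = {}
    while not finished.empty():  # safe: all workers have been joined
        order, outcome = finished.get()
        results[order] = outcome
    return results
```

Calling `serve_orders(range(10), lambda n: n * n, workers=3)` handles ten orders with three threads, rather than lighting one thread per order.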

    There are plenty of good workload-management systems, including some that are designed to share work in a computing cluster, as well as those that are designed to be self-adapting to changing workload mixes and resource-constraint pressures.   The goal is to find the ever-changing “sweet spot” in which a maximum number of units-of-work are being processed, in the least amount of time, without creating traffic-jams at any point (including the OS scheduler itself).   The principles used are simply taken from the real world of human and industrial processes.

      Almost every sentence in that diatribe is wrong.

      I can't be bothered to explain why any more cos you'll only regurgitate it back to me in the wrong context a week or two from now when you've forgotten where you read it and what it meant.

