In another situation, I'm making a large ugly GUI program (uses Gtk2), and I want the GUI to keep active while background work is running. However, there are some complex data structures that need to be accessed by both the GUI and the worker thread, so I really want a "real" thread that shares everything by default, rather than copying.

Remember I'm saying this without the benefit of your knowledge of the actual application, but in general, a GUI doesn't need access to large volumes of shared data. GUIs by their nature are for presentation, and presenting large volumes of data to the user means they have to scroll up and down or side to side trying to find anything of interest.

Two representative scenarios come to mind:

  1. Allowing the user to select a subset of the data for input to a processing algorithm.

    Typically, the user is shown the first 20 or 50 rows of the data at the top of a huge scrolling list. They can scroll up and down, but they only see a small window on the data at a time. They may be able to select rows or items individually, but if the dataset is large, say more than a few hundred, it is a very clumsy process to visually select the appropriate subset for processing.

    So, the next obvious step is that some mechanism is provided that allows them to specify criteria--a wildcard; this field==xyz--that causes the program to perform the sub-selection for them. Now the list is shorter, and they may choose to scan the shortened list before committing it for processing.

    Once they are happy, only the selected subset of the data is required by the worker thread(s). So, there is no reason to have all threads share all the data.

    Indeed, in many situations, each of your worker threads only needs access to its proportionate share of the selected subset. In many more, each worker only needs access to one work item at a time.

  2. Allowing the user to monitor the output of an algorithm.

    As the background workers work, the GUI's role is little more than to keep the user apprised that they are working. This is often done by presenting the user with an auto-scrolling list of the results as they are produced.

    Which might look cool and give the user "warm fuzzies", but computers produce results far faster than the human eye can register, never mind the speed at which the human brain can digest them for the purpose of making decisions. The huge scrolling list only serves as a reassurance that things are progressing happily and haven't bogged down for any reason.

    And that goal is often better and more cheaply achieved by presenting the user with one or more scrolling counts, or even just a constantly updating progress bar.
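As a rough sketch of the second scenario, assuming a threads-enabled perl and the core Thread::Queue module: the workers post one token per finished item, and the main (GUI) thread only ever sees a count. The names `worker`, `run_with_progress` and the `$report` callback are made up for illustration, and the squaring stands in for whatever the real work is. In a real Gtk2 program the polling loop would presumably be replaced by a timer callback that calls `dequeue_nb`, so the GUI never blocks.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical worker: processes its own private copy of the items and
# posts one token per finished item. It never touches any GUI data.
sub worker {
    my ($q, @items) = @_;
    for my $item (@items) {
        my $result = $item ** 2;    # stand-in for the real work
        $q->enqueue(1);             # "one more item done"
    }
    return;
}

# Hypothetical driver: $report is called with (done, total) whenever
# the count changes; a GUI would update its progress bar there.
sub run_with_progress {
    my ($items, $nworkers, $report) = @_;
    my $q = Thread::Queue->new;

    # Deal the items out round-robin; each worker receives its own copy.
    my @chunks = map { [] } 1 .. $nworkers;
    push @{ $chunks[ $_ % $nworkers ] }, $items->[$_] for 0 .. $#$items;
    my @threads = map { threads->create(\&worker, $q, @$_) } @chunks;

    my ($done, $total) = (0, scalar @$items);
    while ($done < $total) {
        my @tokens = $q->dequeue_nb(100);    # never blocks
        if (@tokens) {
            $done += @tokens;
            $report->($done, $total);
        }
        else {
            select(undef, undef, undef, 0.01);    # brief nap, then poll again
        }
    }
    $_->join for @threads;
    return $done;
}

run_with_progress([1 .. 50], 4, sub { printf "%d of %d done\n", @_ });
```

The queue does its own locking internally, but it is the only thing shared; neither the input data nor the results are visible to more than one thread.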

In both of the above scenarios, the intuitive solution is to load all the data into a single, huge, shared data structure so that whatever thread needs access to any part of it has direct access to it.

But the reality is that this intuition is often naive and misleading, regardless of whether you are using ithreads' explicitly-shared-only model or C's everything-shared model. This is because if all threads have access to all the data, every thread has to employ locking around every read and write. And the overhead of locking is the biggest barrier to efficient use of threading.

The way to avoid that overhead is to avoid sharing data that doesn't need to be shared. Rather than giving all threads access to all the data, give each thread access to only the data it needs to process. In that way, each thread can be written in the knowledge that only it has access to its particular subset of the data, and can therefore dispense with locking entirely. And the benefits of that, for both the simplicity of the coding and the performance of the algorithm, are profound. Again, this is true whether we're talking about Perl threading or any other language.
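To make that concrete, here is a minimal sketch, assuming a threads-enabled perl. Each worker is handed a private copy of its slice as arguments to `threads->create` and hands back a single value via `join`, so nothing is declared shared and there is no `lock` anywhere. The names `process_slice` and `run_partitioned` are invented for the example, and summing squares is just a placeholder for real processing.

```perl
use strict;
use warnings;
use threads;

# Hypothetical workload: sum the squares of the numbers in one slice.
sub process_slice {
    my @slice = @_;    # private copy; no other thread can see it
    my $sum = 0;
    $sum += $_ * $_ for @slice;
    return $sum;
}

# Split the data into up to $nworkers contiguous slices, one per thread.
sub run_partitioned {
    my ($data, $nworkers) = @_;
    my $per = int((@$data + $nworkers - 1) / $nworkers);    # ceiling division
    my @threads;
    for my $i (0 .. $nworkers - 1) {
        my $lo = $i * $per;
        last if $lo > $#$data;
        my $hi = $lo + $per - 1;
        $hi = $#$data if $hi > $#$data;
        # The slice is copied into the new thread; nothing is shared.
        push @threads, threads->create(\&process_slice, @{$data}[$lo .. $hi]);
    }
    my $total = 0;
    for my $thr (@threads) {
        my ($part) = $thr->join;    # collect each worker's partial result
        $total += $part;
    }
    return $total;
}

my $total = run_partitioned([1 .. 100], 4);
print "total: $total\n";    # prints "total: 338350"
```

The copying that ithreads does by default is what makes this safe: the cost is paid once, when the thread is started, instead of on every read and write.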

Not all algorithms fit the above observations, but many more do than don't. Though sometimes it requires you to look at the problem with fresh eyes to see the no-sharing, therefore lock-free, solution.

I too wish that the threads::shared mechanism was lighter, and I have recently begun to get inklings of a possibility that might allow it, or something like it, to be so--but don't hold your breath. As things stand, there are often--even usually--ways of approaching most data sharing problems that work with the strengths of the ithreads model rather than against them.

If you have particular real-world scenarios in which you are finding the sharing model a particular limitation, please share [sic] them (in detail). Whilst I make no promises of a ready solution, the more eyes that see the problems, the more likely a solution will be forthcoming. And even if none is, having known, real limitations documented may lead to better things down the road.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

In reply to Re^5: issue of concurrency: which module is better by BrowserUk
in thread issue of concurrency: which module is better by llancet
