I guess the Perl 5 equivalent is to create an iterator and threads, where one thread pushes onto a queue and another takes off it? So you have essentially a bucket brigade of threads. Each thread performs a filtering process and passes its output on to the next filter. So what is the approach to scale this idea so that each filter can be replaced / modified / added / deleted?

Unfortunately, the overhead of Perl's implementation of shared data is such that passing individual pieces of data between threads becomes prohibitively expensive.

This can be mitigated to some extent by batching the individual elements into chunks. This is fairly easy to do using a wrapper class over the Thread::Queue module; or better yet, a custom queue module that avoids locking contention by batching to a non-shared buffer and only locking when a buffer is ready to be exchanged.
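As a rough illustration of the batching idea, here is a minimal sketch of a wrapper over Thread::Queue. The `BatchQueue` package, its method names and the batch size are all hypothetical, invented for this example. It assumes a threaded perl and a Thread::Queue recent enough to provide `end` (3.01 or later), and that `undef` is never queued as an item. It still uses Thread::Queue's own locking underneath, so it only cuts the number of shared-data operations; it is not the custom lock-avoiding queue described above.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical wrapper: items collect in a plain, non-shared array and
# cross the thread boundary as one array reference per batch.
package BatchQueue;

sub new {
    my ( $class, %args ) = @_;
    return bless {
        queue => $args{queue} // Thread::Queue->new,
        size  => $args{size}  // 1000,    # assumed batch size
        out   => [],                      # producer-side buffer (per thread)
        in    => [],                      # consumer-side buffer (per thread)
    }, $class;
}

# Buffer items locally; touch the shared queue only when a batch is full.
sub put {
    my ( $self, @items ) = @_;
    push @{ $self->{out} }, @items;
    $self->flush if @{ $self->{out} } >= $self->{size};
    return;
}

# Hand the current batch to the shared queue (one enqueue per batch).
sub flush {
    my ($self) = @_;
    return unless @{ $self->{out} };
    $self->{queue}->enqueue( $self->{out} );
    $self->{out} = [];
    return;
}

# Flush the remainder and signal that no more data is coming.
sub finish {
    my ($self) = @_;
    $self->flush;
    $self->{queue}->end;
    return;
}

# Return the next item, or undef once the queue is ended and drained.
sub get {
    my ($self) = @_;
    unless ( @{ $self->{in} } ) {
        my $batch = $self->{queue}->dequeue;
        return unless defined $batch;
        $self->{in} = [@$batch];    # copy into a non-shared array
    }
    return shift @{ $self->{in} };
}

package main;

my $bq = BatchQueue->new( size => 100 );

# Consumer thread: sums everything it receives.
my $consumer = threads->create( sub {
    my $sum = 0;
    while ( defined( my $n = $bq->get ) ) { $sum += $n }
    return $sum;
} );

$bq->put($_) for 1 .. 1000;
$bq->finish;

my $total = $consumer->join;
print "total: $total\n";
```

Each thread gets its own copy of the `out` and `in` buffers when it is created, while the Thread::Queue inside stays shared, so only whole batches pay the shared-data cost.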

Another alternative is to use either pipes or sockets for the transfer of data. Both do their own batching (buffering), but unless you carefully design the size of the elements to fit those buffers, transfers can be less than optimal because elements straddle buffer boundaries.
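One way to keep elements from straddling buffer boundaries is to use fixed-size records whose length divides the write size evenly. The following is only a sketch under assumptions: the 16-byte record layout (`N A12`), the 4096-byte write size and the variable names are invented for illustration, and the right sizes for a real system depend on the platform's pipe and PerlIO buffers.

```perl
use strict;
use warnings;
use threads;

use constant RECORD_SIZE => 16;     # assumed: 4-byte id + 12-byte padded tag
use constant PER_WRITE   => 256;    # 256 * 16 = 4096 bytes per write

pipe( my $reader, my $writer ) or die "pipe failed: $!";
binmode $_ for $reader, $writer;

# Start the consumer before anything is written, so no buffered data is
# duplicated when the handles are cloned into the new thread.
my $consumer = threads->create(
    { context => 'list' },
    sub {
        close $writer;    # this thread's copy of the write end, so EOF arrives
        my ( $count, $sum ) = ( 0, 0 );
        while ( ( read( $reader, my $rec, RECORD_SIZE ) // 0 ) == RECORD_SIZE ) {
            my ( $id, $tag ) = unpack 'N A12', $rec;
            $count++;
            $sum += $id;
        }
        return ( $count, $sum );
    }
);

# Producer: accumulate whole records and write them in fixed-size chunks.
my $buf = '';
for my $id ( 1 .. 10_000 ) {
    $buf .= pack 'N A12', $id, "item$id";
    if ( length($buf) >= RECORD_SIZE * PER_WRITE ) {
        print {$writer} $buf or die "write failed: $!";
        $buf = '';
    }
}
if ( length $buf ) {
    print {$writer} $buf or die "write failed: $!";
}
close $writer or die "close failed: $!";

my ( $count, $sum ) = $consumer->join;
print "records: $count, id sum: $sum\n";
```

Because every write is a whole number of records, the reader never has to reassemble an element split across two transfers.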

The fastest, most efficient solution is to drop into C for the queue, and bypass Perl's shared memory entirely. Unfortunately, the per-thread memory allocation pool model employed by threaded perls makes this far more difficult to realise correctly than it should be.


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
In the absence of evidence, opinion is indistinguishable from prejudice.

In reply to Re^3: [OT] Software Architecture - Pipe and Filter by BrowserUk
in thread [OT] Software Architecture - Pipe and Filter by trippledubs
