I usually start with this thread: "How to create thread pool of ithreads".

I have also asked a question about sharing complex/nested data between threads here: "ithreads, locks, shared data: is that OK?".

You seem to need to share only an array and a hash of scalars, which is much simpler and faster. The array can go through Thread::Queue (edit: see also: Re: Multi-thread combining the results together); see below for the hash. You may be tempted to share a hash between threads to store the %result. I would say don't in this case, because the data will be duplicated in each thread. It is not clear to me whether threads::shared actually shares a reference guarded by locks and semaphores, or whether it does a transparent and sophisticated data duplication behind the scenes. From the manual:

By default, variables are private to each thread, and each newly created thread gets a private copy of each existing variable. This module allows you to share variables across different threads

In order to avoid sharing a hash I use this trick: push into the done queue a string like "$token=[@line_results]". When all threads are joined I convert the strings to the results hash.
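Here is a minimal sketch of that trick, not the code from the original problem. The %input data, the doubling "work" and the comma-joined "token=v1,v2" string format are all made up for illustration, and it assumes a perl built with ithreads and Thread::Queue 3.01 or later (for end()).

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical input: token => list of numbers to process.
my %input = (alpha => [1, 2, 3], beta => [10, 20], gamma => [7]);

my $work = Thread::Queue->new(sort keys %input);   # queue of tokens
my $done = Thread::Queue->new;                     # queue of result strings
$work->end;   # nothing more will be added, so dequeue returns undef when empty

sub worker {
    while (defined(my $token = $work->dequeue)) {
        # Each thread has its own private copy of %input; it is only read here.
        my @line_results = map { $_ * 2 } @{ $input{$token} };
        # Flatten the result into a plain string instead of sharing a hash.
        $done->enqueue("$token=" . join(',', @line_results));
    }
}

my @pool = map { threads->create(\&worker) } 1 .. 2;
$_->join for @pool;
$done->end;

# Only now, in the main thread, turn the strings back into a hash.
my %result;
while (defined(my $item = $done->dequeue)) {
    my ($token, $values) = split /=/, $item, 2;
    $result{$token} = [ split /,/, $values ];
}

print "$_ => @{ $result{$_} }\n" for sort keys %result;
```

The string format only works if tokens never contain "=" and values never contain ","; for anything messier you would have to pick another separator or a real serialiser.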

It will make a huge difference in performance if you minimise your reads/writes to the Queue. For example, read (edit: and dequeue!) all the data a particular thread will need once at the beginning, instead of one item at a time in a loop. Do the processing and write the results to a temporary thread-private variable. Then write that variable to the Queue in one go when done, which eliminates the locking and unlocking you would otherwise pay for on every single write...
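A rough sketch of that idea, under my own assumptions: the numbers 1..20, the squaring step and the batch size are hypothetical, and the work queue is completely filled before any thread starts, which is why the non-blocking dequeue_nb(COUNT) is enough here. Each worker pulls several items per queue access, keeps its results in a private array, and enqueues them with a single call at the end.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $work = Thread::Queue->new(1 .. 20);   # hypothetical work items
my $done = Thread::Queue->new;
my $BATCH = 5;                            # hypothetical batch size

sub worker {
    my @private;   # thread-private results, no locking needed
    # dequeue_nb(COUNT) returns up to COUNT items in one locked operation.
    # The grep guards against the single undef returned when COUNT is 1
    # and the queue is empty.
    while (my @batch = grep { defined } $work->dequeue_nb($BATCH)) {
        push @private, map { $_ * $_ } @batch;
    }
    # One enqueue call for everything this thread produced.
    $done->enqueue(@private) if @private;
}

my @pool = map { threads->create(\&worker) } 1 .. 3;
$_->join for @pool;

# All workers are finished, so drain the done queue without blocking.
my @results;
while (defined(my $r = $done->dequeue_nb)) {
    push @results, $r;
}
@results = sort { $a <=> $b } @results;
print scalar(@results), " results, largest is $results[-1]\n";
```

How the items get split between the threads depends on scheduling, so only the combined result is predictable, not which thread did what.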

So reducing your running time in proportion to the number of threads is a holy grail, because reading and writing shared data always has a cost. That is also why parallelism can sometimes be worse for performance! Ah, the eternal battle between cooks and romantics ("too many cooks in the kitchen" vs "many hands make light work"). Aim to share as little as possible...

Of course nothing stops you from requesting a memory segment shareable by all processes/threads via IPC::Shareable. There you can de-/serialise any complex data structure to be shared, but you will need to implement your own locking. A recent post with some code: Re: IPC::Shareable sometimes leaks memory segments

bw, bliako


In reply to Re: Multi-thread combining the results together by bliako
in thread Multi-thread combining the results together by Marshall
