Hi BrowserUk,

Why you feel that using fork should require a substantial re-write is far from obvious from the scant details you've provided.

Basically, the current setup is sequential in nature and depends on the product on top of which it does all its jobs. The base product's APIs support a logical grouping of the input feed information and of its processing, but the current implementation does not use it. Although separate Perl modules implement the different tasks, they do not group tasks the way the base product supports. Processing is also stage-based, handling the complete input feed as one set, which prevents parallel processing of the logical group tasks. The subtasks have no specific need for shared data; they can execute in parallel without using each other's data or any common data. So I feel fork() is more suitable for this scenario. To enable parallel processing from start to end, the current implementation has to be modified to use the logical grouping that the product prefers and supports. I am not sure if I answered your question satisfactorily, but going through the current code I can see a good amount of change; it would not be 5-10k lines, but it affects a good amount of code :-)
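As a rough sketch of the fork-per-group idea described above: one child per logical group, no shared data, and the parent only collects exit statuses. The group names and process_group() are hypothetical placeholders, not the base product's API; the real work would replace the body of process_group().

```perl
use strict;
use warnings;

# Hypothetical logical groups; the real ones would come from the base product.
my @groups = ('group_a', 'group_b', 'group_c');

# Placeholder for the per-group work; returns true on success.
sub process_group {
    my ($group) = @_;
    return length($group) > 0;
}

# Fork one child per group and return a hash of group name => exit status.
sub run_groups {
    my @names = @_;
    my %pid_to_group;
    for my $group (@names) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: works on its own group only, then exits.
            exit(process_group($group) ? 0 : 1);
        }
        $pid_to_group{$pid} = $group;
    }
    my %status;
    while (%pid_to_group) {
        my $pid = wait();
        last if $pid == -1;            # no children left to reap
        my $group = delete $pid_to_group{$pid};
        next unless defined $group;    # not one of ours
        $status{$group} = $? >> 8;     # exit code of that child
    }
    return %status;
}

my %status = run_groups(@groups);
print "$_ exited with $status{$_}\n" for sort keys %status;
```

Since each child gets its own copy of the parent's memory, nothing here needs locking; results that must come back to the parent would have to travel through exit codes, pipes, or files.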

Nothing in the sparse details you've outlined gives me cause to reach that conclusion

It is also because of these reasons:

Do your subtasks need read-write access to shared data? If not, why are you considering threads?

Is the 1GB of data per subtask different for each subtask, or shared by all the subtasks? If the latter, then you may well find that the process model is more economic because of copy-on-write.

To

Given the capacity of the hardware you have available, I could well see the above reducing the runtime to less than 5% of the serial version; though the devil is in the details you have not provided.

The correct logical grouping of subtasks, combined with concurrent processing, is the main factor in reducing overall processing time; it will also align with the approach the product prefers and supports.

Ditto, the need for file-based counting and locking.

It's my mistake: when I mentioned using a file to count child process state, I had actually used a counter in the parent's memory to do that, in the test code I wrote as a skeleton of the modified solution.
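A minimal sketch of that kind of in-memory counting, assuming the parent is the only process that forks and reaps children. The run_limited() helper and its job code refs are hypothetical, written for illustration only; they are not the actual test code.

```perl
use strict;
use warnings;

# Run each job (a code ref) in its own child, keeping at most $max children
# alive at once. Returns the number of children that exited with status 0.
sub run_limited {
    my ($max, @jobs) = @_;
    die "max must be at least 1" unless $max >= 1;

    my $running = 0;   # in-memory counter, maintained by the parent only
    my $ok      = 0;

    # Reap one finished child and update the counters.
    my $reap_one = sub {
        my $pid = wait();
        if ($pid == -1) {      # nothing left to wait for
            $running = 0;
            return;
        }
        $running--;
        $ok++ if $? == 0;
    };

    for my $job (@jobs) {
        # Block until a slot is free before starting another child.
        $reap_one->() while $running >= $max;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            exit($job->() ? 0 : 1);   # child: do the work and leave
        }
        $running++;
    }
    $reap_one->() while $running > 0;   # drain the remaining children
    return $ok;
}

my @jobs = map { my $n = $_; sub { $n > 0 } } 1 .. 5;
my $ok = run_limited(2, @jobs);
print "$ok of ", scalar(@jobs), " children succeeded\n";
```

Because only the parent touches $running, no file and no locking are needed; the children never see or modify the counter after the fork.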

Thanks again for providing insight on this subject! I will make use of all the suggestions given so far.

Best regards, Pawan


In reply to Re^4: ithreads or fork() what you recommend? by pawan68923
in thread ithreads or fork() what you recommend? by pawan68923
