in reply to Re: ithreads or fork() what you recommend?
in thread ithreads or fork() what you recommend?

Hi BrowserUk,

Basically, there is not much shared data between sub tasks, and I can avoid much of what there is. The only thing they need to share (both read and write), via a file, is a count of how many processes of each type are executing, e.g. whether there are 5 or 6 active sub tasks of one category. Based on the overall and per-type concurrency limits, a process can start a new sub task and increment that number in the shared file. The read and write operations on the shared file will be over in a fraction of a second. And to avoid any deadlock, I can wait a few seconds (a while loop with an exit condition) whenever a process cannot read or write the shared file when it wants to. For pure read operations on the shared file, I feel there is no need to worry about synchronization between sub tasks; a simple read retry should be sufficient.
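
If I keep the file approach, the update would look something like this minimal sketch (the counter file name, its single-number format, and the limit of 6 are just placeholders for illustration):

    use strict;
    use warnings;
    use Fcntl qw( :flock :seek );

    my $COUNTER_FILE   = '/tmp/subtask_count';   # hypothetical shared counter file,
                                                 # assumed to exist with an initial 0
    my $MAX_CONCURRENT = 6;                      # illustrative per-type limit

    # Returns true (and increments the count) if a new sub task may start.
    sub try_start_subtask {
        open my $fh, '+<', $COUNTER_FILE or die "open: $!";
        flock $fh, LOCK_EX or die "flock: $!";    # exclusive lock for read-modify-write

        chomp( my $count = <$fh> // 0 );

        my $started = 0;
        if ( $count < $MAX_CONCURRENT ) {
            seek $fh, 0, SEEK_SET;
            truncate $fh, 0;
            print $fh $count + 1, "\n";           # increment under the lock
            $started = 1;
        }

        close $fh;                                # closing releases the lock
        return $started;                          # the decrement on sub task exit
    }                                             # is the symmetric operation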

Based on the detailed explanation you have given, I feel that multiprocessing using fork() would be more appropriate than threads. I thought of using threads for only one reason: it would have avoided significant code change while still giving me the benefit of parallel processing of sub tasks.

To answer your query about the system: it is a high-end Sun server with 40+ CPUs and 48 GB RAM, running Solaris 10. The Perl modules use APIs of an enterprise product to perform various operations related to the product (Perl handles both the automation and the complex business logic), using an input feed from a CSV file.

Currently it is a mostly sequential approach, with parallel processing only at the product API level using fork(). I have to change it to end-to-end parallel processing (it is possible to logically group sub tasks) to reduce the processing time, which depends heavily on the enterprise product. I expect parallel processing to give around a 40-50% (10 hour) reduction in overall processing time, hence this question.

I have to confess, I learned some really deep things from the answers given to my question! And now I feel that fork() would be the better option in this case, with only the overhead of writing a lot more code to enable it :-).

Thanks a lot to you and the others for the valuable suggestions and insight into these parallel processing options using threads and fork().

Best regards, Pawan

Re^3: ithreads or fork() what you recommend?
by flexvault (Monsignor) on May 19, 2012 at 09:34 UTC

    pawan68923,

      ...the only thing they need to share (both read and write), via a file, is a count of how many processes of each type are executing...

    I would have the parent keep these counters in memory. From your description, the parent starts the processes and maintains the counts. Why use a file?
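
    Roughly like this minimal sketch (the task list, type names, worker sub, and limit are invented for illustration):

        use strict;
        use warnings;

        my %active;               # running-children count per sub-task type
        my %type_of;              # pid => type, so the parent knows what to decrement
        my $MAX_PER_TYPE = 5;     # illustrative limit

        for my $task ( @tasks ) {             # assumed: [ $type, $payload ] pairs
            my ( $type, $payload ) = @$task;

            # Block until this type is below its limit, reaping finished children.
            while ( ( $active{$type} // 0 ) >= $MAX_PER_TYPE ) {
                my $pid = waitpid -1, 0;
                $active{ delete $type_of{$pid} }-- if $pid > 0;
            }

            defined( my $pid = fork ) or die "fork: $!";
            if ( $pid == 0 ) {                # child
                do_subtask( $type, $payload );    # hypothetical worker
                exit 0;
            }
            $type_of{$pid} = $type;           # parent bookkeeping, all in memory
            $active{$type}++;
        }

        # Reap the stragglers.
        while ( ( my $pid = waitpid -1, 0 ) > 0 ) {
            $active{ delete $type_of{$pid} }--;
        }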

    Some things to consider:

    • Test your theory. That many CPUs may behave differently than you (or I) might expect. Many times I have assumed a solution, only to find that it didn't scale well. You could have dependencies that you don't even know exist.
    • Always lock, even if access is "read only". 'flock' is trivial compared to the time you will spend fixing a solution when something that was "read only" later needs to "read and write" (see the sketch after this list).
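
    For instance, even a pure reader should take a shared lock; something like this (file name illustrative):

        use Fcntl ':flock';

        open my $fh, '<', '/tmp/subtask_count' or die "open: $!";
        flock $fh, LOCK_SH or die "flock: $!";   # shared lock: readers don't block each
                                                 # other, but do wait out a writer's LOCK_EX
        chomp( my $count = <$fh> );
        close $fh;                               # closing releases the lock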

    Sounds like a massive undertaking -- should be fun!

    Good Luck!

    "Well done is better than well said." - Benjamin Franklin

      Hi Flexvault,

      It's my mistake that I mentioned using a file to count child process state; I actually used the parent's in-memory counter for that in the test code I wrote as a skeleton of the modified solution. I will make use of your advice. Thanks much!

      Best regards, Pawan
Re^3: ithreads or fork() what you recommend?
by BrowserUk (Patriarch) on May 19, 2012 at 14:38 UTC
    Based on the detailed explanation you have given, I feel that multiprocessing using fork() would be more appropriate than threads. I thought of using threads for only one reason: it would have avoided significant code change while still giving me the benefit of parallel processing of sub tasks.

    Hm. Nothing in the sparse details you've outlined gives me cause to reach that conclusion; especially if -- as you've suggested -- using fork would require a substantial re-write.

    Let's say the basic structure of your current serial application is something like:

    #! perl -slw
    use strict;

    use constant {
        TOTAL_JOBS => 130,
    };

    for my $job ( 1 .. TOTAL_JOBS ) {
        open my $in, '<', 'bigdata.' . $job or die $!;
        my @localData = <$in>;
        close $in;

        ## do stuff with @localData
    }

    Then converting that to concurrency using threads could be as simple as:

    #! perl -slw
    use strict;
    use threads;

    use constant {
        TOTAL_JOBS     => 130,
        MAX_CONCURRENT => 40,
    };

    for my $job ( 1 .. TOTAL_JOBS ) {
        async {
            open my $in, '<', 'bigdata.' . $job or die $!;
            my @localData = <$in>;
            close $in;

            ## do stuff with @localData
        };

        sleep 1 while threads->list( threads::running ) >= MAX_CONCURRENT;
        $_->join for threads->list( threads::joinable );
    }

    sleep 1 while threads->list( threads::running );
    $_->join for threads->list( threads::joinable );

    But I see parallel processing giving around 40-50% (10 Hours) reduction in overall processing time, hence this question.

    Given the capacity of the hardware you have available, I could well see the above reducing the runtime to less than 5% of the serial version; though the devil is in the details you have not provided.

    Of course, using Parallel::ForkManager should allow a very similar time reduction, using a very similar minor modification of the existing code.
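
    A minimal sketch of that variant (untested; same job and file naming as above):

        #! perl -slw
        use strict;
        use Parallel::ForkManager;

        use constant {
            TOTAL_JOBS     => 130,
            MAX_CONCURRENT => 40,
        };

        my $pm = Parallel::ForkManager->new( MAX_CONCURRENT );

        for my $job ( 1 .. TOTAL_JOBS ) {
            $pm->start and next;    # parent forks a child and moves to the next job

            open my $in, '<', 'bigdata.' . $job or die $!;
            my @localData = <$in>;
            close $in;

            ## do stuff with @localData

            $pm->finish;            # child exits here
        }

        $pm->wait_all_children;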

    Why you feel that using fork should require a substantial re-write is far from obvious from the scant details you've provided. Ditto, the need for file-based counting and locking.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      Hi BrowserUk,

      Why you feel that using fork should require a substantial re-write is far from obvious from the scant details you've provided.

      Basically, the current setup is sequential in nature and depends on the product on top of which it does all its jobs. The base product supports a logical grouping of the input feed information, and of its processing through the product's APIs, which the current implementation does not use. Though there are separate Perl modules to do the different tasks, they do not utilize the logical grouping of tasks the way the base product supports it; processing is also stage-based over the complete input feed as one set, which prohibits parallel processing of the logical task groups.

      Also, there is no specific need for shared data between sub tasks: they can execute in parallel without using each other's data or common data. So I feel fork() is more suitable for this scenario. To enable parallel processing from start to end, the current implementation has to be modified to use the preferred logical grouping supported by the product. I am not sure if I have answered your question satisfactorily, but going through the current code I can see a good amount of change; it would not be 5-10k lines, but it does impact a good amount of code :-)

      Nothing in the sparse details you've outlined gives me cause to reach that conclusion

      It is also because of these reasons:

      Do your subtasks need read-write access to shared data? If not, why are you considering threads?

      Is the 1GB of data per subtask different for each subtask, or shared by all the subtasks? If the latter, then you may well find that the process model is more economic because of copy-on-write.

      And regarding:

      Given the capacity of the hardware you have available, I could well see the above reducing the runtime to less than 5% of the serial version; though the devil is in the details you have not provided.

      Getting the logical groupings of sub tasks right, and processing them concurrently, is the main factor in reducing the overall processing time; it also aligns with the product's preferred approach.

      Ditto, the need for file-based counting and locking.

      It's my mistake that I mentioned using a file to count child process state; I actually used the parent's in-memory counter for that in the test code I wrote as a skeleton of the modified solution.

      Thanks again for providing insight on this subject! I will make use of all the suggestions given so far.

      Best regards, Pawan