There are critical questions you have not answered. What operating system are you running on? (Any form of *nix would make fork work well. Windows only emulates fork, which could make that solution significantly worse.) What do you expect your performance bottleneck to be? (CPU? Disk? Network delays?) What kind of hardware are you working on? (Number of CPUs? Number of disks?) Are there significant initialization costs? (E.g., database connections cannot be preserved across a fork, and are expensive to create.) How much data needs to be passed around? Is there any possibility of moving this to a cluster?

For an extreme example, if you're using Windows and are expecting to bottleneck on local CPU on a 1-CPU machine, you absolutely should make this job a single, single-threaded process. With only one CPU to share, extra processes or threads add overhead without adding capacity.

Suppose that you're bottlenecked on network time delays and there is an Oracle database connection needed per worker. Then you really want several persistent workers. Single process, multiple threads would beat constant forking.
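
To make the persistent-worker idea concrete, here is a minimal sketch. It is written in Python purely for illustration, and `FakeConnection` is a hypothetical stand-in for an expensive database connection, not a real driver. The point it shows is that each worker thread pays the connection cost once and then reuses the connection for every job it pulls off a shared queue.

```python
import queue
import threading
import time


class FakeConnection:
    """Hypothetical stand-in for an expensive database connection."""
    opened = 0
    _lock = threading.Lock()

    def __init__(self):
        time.sleep(0.05)  # pretend that connecting is slow
        with FakeConnection._lock:
            FakeConnection.opened += 1

    def query(self, job):
        time.sleep(0.01)  # pretend this is a network round trip
        return job * 2


def worker(jobs, results):
    conn = FakeConnection()  # paid once per worker, not once per job
    while True:
        job = jobs.get()
        if job is None:  # sentinel: no more work for this worker
            break
        results.put((job, conn.query(job)))


def run(job_list, n_workers=4):
    jobs, results = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(jobs, results))
               for _ in range(n_workers)]
    for t in threads:
        t.start()
    for job in job_list:
        jobs.put(job)
    for _ in threads:
        jobs.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # Every job produced exactly one result, so drain that many.
    return dict(results.get() for _ in job_list)
```

With 20 jobs and 4 workers, only 4 connections are ever opened, and the workers overlap their waits on the network.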

Suppose that you're bottlenecked on disk seek time, you're on a Unix system, and there are no startup costs. Then I would recommend the fork approach.
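
A rough sketch of the fork approach, again in Python only for illustration and assuming a Unix system. The `process` function is a hypothetical placeholder for the real disk-bound work, and the cap of 4 simultaneous children is an arbitrary assumption that you would tune to your number of disks.

```python
import os


def process(job):
    # Hypothetical unit of work; a real job would read from disk here.
    # Returns the exit code for the child: 0 on success, 1 on failure.
    return 0 if job >= 0 else 1


def run_forked(jobs, max_children=4):
    """Fork one child per job, with at most max_children alive at once.

    Returns the number of children that did not exit cleanly.
    """
    active = 0
    failures = 0
    for job in jobs:
        if active >= max_children:
            _, status = os.wait()  # block until some child finishes
            active -= 1
            if status != 0:
                failures += 1
        pid = os.fork()
        if pid == 0:
            # Child: do the work, then exit without running parent cleanup.
            try:
                code = process(job)
            except Exception:
                code = 1
            os._exit(code)
        active += 1
    while active:  # reap the remaining children
        _, status = os.wait()
        active -= 1
        if status != 0:
            failures += 1
    return failures
```

Because there are no startup costs in this scenario, a fresh child per job is cheap, and the children share nothing that needs locking.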

Suppose that you're bottlenecked on CPU and there is a possibility of throwing multiple machines at the problem. Then I'd recommend neither of your approaches. Instead I'd look for a way to farm out jobs to multiple processes on multiple machines. One approach is to use a standard clustering solution. A very cheesy approach that I must admit to having used in the past is to make the job run in a webserver, and then use a load balancer to distribute requests. (Hey, I had the webservers already set up and sitting there mostly idle...) Another interesting approach is to have a database with a table for open jobs. Then have workers on multiple machines query it. (I set up a batch processing system on this principle and it worked well. It was suggested to me by a former boss who had set up a swaps trading system on the principle, with some of the "workers" for some types of jobs really being people.)

Every one of these solutions, and more, has been used successfully. Every one has advantages and cases where it is best. Anyone who gives you an absolute answer saying that one of them is always the right way to go doesn't know what they are talking about.

I didn't really answer your question. But hopefully I gave you enough to think about that you can have a better chance of coming up with the right solution for your situation. Oh, and I gave you a few more options to consider. :-)

Update: I messed up one of my examples. If you're bottlenecked on network round trips then a single machine should be able to run enough copies to move the bottleneck to the server on the other end. In which case there is no need to complicate things with the cluster. But if CPU is your problem then you would want to split up work onto multiple machines.


In reply to Re: best strategy by tilly
in thread best strategy by libvenus
