PerlMonks  

The memory size is only part of the problem. The problem with spawning too many children also has to do with context switches.

I regularly work with seven separate databases, each having eight sub-databases (and a huge amount of data). So, in principle, I could launch 56 parallel processes to extract data from these 56 data sources. Most of the time, I don't use forks or threads, but simply launch parallel background processes under the shell. The processes I am referring to are sometimes in Perl and sometimes in a variety of other languages (including a proprietary equivalent of PL/SQL), so using the OS to fork background processes is often the easiest route: the shell and the OS manage the parallel tasks, and the program handles the functional/business requirements.

Our servers usually have 8 CPUs. We have been doing this for many years and have tried a number of options and configurations, and we have found that the optimal number of processes running in parallel is usually between 8 and 16, depending on the specifics of the individual programs (i.e. some are faster with 8 processes, some with 12 and some with 16, depending on what exactly they are doing and how).

If we use fewer than 8 processes, we have an obvious under-utilization of the hardware; if we let all 56 processes run in parallel, the overall job takes much longer to execute than with 8 to 16 processes, and we strongly believe that this is due to context switches and memory usage. In some cases (when the extraction needs heavy DB sorting, for example), the overall job even fails for lack of memory if we have too many processes running in parallel. But in most cases, it really seems to be linked to having too many processes running for the number of CPUs available, leading to intensive context switching.

So what we do is put the 56 processes into a waiting queue, whose only role is to keep the number of processes actually running in parallel at the optimum (8 to 16, depending on the program). This is at least the best solution we've found so far, with 5 or 6 people each having about 15 years of experience on the subject. Now, if anyone has a better idea, I would gladly take it up and try it out.
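
For readers who want to try the idea, here is a minimal sketch of such a waiting queue under the shell. It is not the actual setup described above, only one way it could be done: it assumes bash 4.3 or later (for `wait -n`), and the names `run_queue` and `MAX_PARALLEL` are made up for the illustration.

```bash
#!/usr/bin/env bash
# Sketch of a waiting queue that caps the number of jobs running at once.
# Assumes bash 4.3 or later (for `wait -n`); run_queue and MAX_PARALLEL
# are illustrative names, not taken from the original post.

MAX_PARALLEL=${MAX_PARALLEL:-8}   # e.g. 8 to 16 on an 8-CPU server

# run_queue CMD...
# Each argument is one command line; at most MAX_PARALLEL run concurrently.
run_queue() {
    local running=0 cmd
    for cmd in "$@"; do
        if [ "$running" -ge "$MAX_PARALLEL" ]; then
            wait -n                    # block until any one job exits
            running=$((running - 1))
        fi
        bash -c "$cmd" &               # start the next queued job
        running=$((running + 1))
    done
    wait                               # let the last jobs finish
}
```

With a hypothetical extraction script, this would be called as `run_queue "perl extract.pl db1 sub1" "perl extract.pl db1 sub2"` and so on for all 56 sources; the right value of `MAX_PARALLEL` still has to be found by measurement for each program.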


In reply to Re^2: Design advice: Classic boss/worker program memory consumption by Laurent_R
in thread Design advice: Classic boss/worker program memory consumption by shadrack
