I think I've pretty much confirmed that R serialises access to some resource across concurrent process instances, even though the evidence is not as clear cut as I would like. In the following console trace, I start 1, 2, 3 and then 4 concurrent copies of R, each performing a ~60 second calculation and timing it with R's own built-in timer. I set the affinity of each process to 1 of my cpus to prevent cpu thrash:

    for /l %i in (1,1,1) do @start /b /affinity %i rscript -e"system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})"
       user  system elapsed
      62.24    0.00   62.68

    for /l %i in (1,1,2) do @start /b /affinity %i rscript -e"system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})"
       user  system elapsed
      65.49    0.00   65.60
       user  system elapsed
      65.19    0.01   66.13

    for /l %i in (1,1,3) do @start /b /affinity %i rscript -e"system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})"
       user  system elapsed
      65.61    0.06   65.94
       user  system elapsed
      65.75    0.03   98.98
       user  system elapsed
      65.55    0.00   99.30

    for /l %i in (1,1,4) do @start /b /affinity %i rscript -e"system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})"
       user  system elapsed
      68.83    0.00   69.81
       user  system elapsed
      70.59    0.00   72.71
       user  system elapsed
      67.30    0.03  101.99
       user  system elapsed
      67.22    0.00  102.65

  1. For 1 copy, it maxes out the appropriate cpu for ~62 seconds, and the elapsed time closely reflects the cpu time used.
  2. For 2 copies, it almost maxes out the two cpus, but both processes show a ~5% 'concurrency overhead' in cpu time.
  3. With 3 copies, again 2 cpus are maxed, but the third shows a less than 50% duty cycle until the first completes, at which point it also (almost) maxes.

    The cpu times of the 3 processes all show the ~5% concurrency overhead, probably indicative of some internal polling for a resource, but the elapsed times of two of them show a much greater overhead: ~99 seconds elapsed against ~66 seconds of cpu, or roughly 50%.

  4. Once we get to 4 copies, the activity traces show 2 maxed and 2 well below 50% until one completes, at which point one of the other two picks up. And the same again once the second completes.

That pretty much nails it (for me) that there is some one-at-a-time resource internal to R that concurrent processes compete for. And the more processes there are competing, the greater the cost of that competition.

All of which probably reflects R's long history and its gestation in the days before running multiple concurrent cpu-bound processes was a realistic option.
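
The overhead figures can be re-derived from the trace with a few lines of arithmetic. This is only a sketch: the timings are copied from the 3-copy run above, while the helper name `pct_over` and the choice of the single-copy user time as the baseline are my own assumptions, not anything R reports.

```python
# Timings copied from the trace above: (user cpu seconds, elapsed seconds).
BASELINE_CPU = 62.24  # user time of the single-copy run (assumed baseline)

three_copies = [(65.61, 65.94), (65.75, 98.98), (65.55, 99.30)]


def pct_over(value, reference):
    """Percentage by which value exceeds reference (hypothetical helper)."""
    return (value / reference - 1.0) * 100.0


for cpu, elapsed in three_copies:
    cpu_overhead = pct_over(cpu, BASELINE_CPU)   # extra cpu vs. running alone
    wait_overhead = pct_over(elapsed, cpu)       # wall-clock time beyond own cpu time
    print(f"cpu {cpu:6.2f}s (+{cpu_overhead:4.1f}% vs 1 copy)  "
          f"elapsed {elapsed:6.2f}s (+{wait_overhead:4.1f}% vs own cpu)")
```

The first percentage is the small cpu-time penalty every copy pays; the second is the time a copy spent waiting rather than computing, which is where the large numbers show up.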

Note: It could be that the shared resource is something simple like the configuration file or history file or similar; and that with the right command line options to disable the use of those, the overhead can be avoided. I haven't explored this. It might be better to ask the guys that know rather than testing random thoughts.
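
One more thing worth ruling out before blaming R: as far as I can tell from the `start /?` help text, `/affinity` takes a hexadecimal bit mask rather than a cpu number. If so, the loop values 1, 2, 3, 4 would not select four distinct cpus, and that alone could account for some of the elapsed-time pattern. The sketch below only does the bit arithmetic; the function names are hypothetical and nothing here touches Windows or R.

```python
def single_cpu_mask(cpu_index):
    """Hex affinity mask selecting exactly one logical cpu (0-based index)."""
    return format(1 << cpu_index, "X")


def cpus_in_mask(hex_mask):
    """List the 0-based cpu indexes that a hex affinity mask allows."""
    mask = int(hex_mask, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# What the loop values 1..4 would select if they are read as hex masks.
for value in ("1", "2", "3", "4"):
    print(value, "->", cpus_in_mask(value))

# Masks that would pin four copies to four distinct cpus.
print([single_cpu_mask(i) for i in range(4)])
```

Under that reading, the third copy is allowed on the same two cpus as the first two, and the fourth gets a cpu to itself. Re-running the test with one-bit masks would separate that effect from anything internal to R.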

Whilst looking around I did come across Rmpi, which might be a better approach to your problem, though it appears to be aimed at spreading the load of single calculations over multiple cpus or boxes, rather than running multiple concurrent calculations. You'd need to read a lot more about it than I've bothered with :)



In reply to Re^9: Using threads to run multiple external processes at the same time by BrowserUk
in thread Using threads to run multiple external processes at the same time by Anonymous Monk
