in reply to Re^9: Using threads to run multiple external processes at the same time
in thread Using threads to run multiple external processes at the same time

If anything, this only served to increase my confusion.

I installed R on my home E7200 (WinXP) and ran your example - thankfully, it was simple and self-contained.

for /l %i in (1,1,1) do @start /b rscript -e "system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})"
   user  system elapsed
  54.50    0.04   54.63

for /l %i in (1,1,2) do @start /b rscript -e "system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})"
   user  system elapsed
  55.64    0.00   55.77
   user  system elapsed
  55.92    0.02   56.08

Test of multiple concurrent R processes on Windows XP on a dual-core machine (R version 2.9.2)

So at least on Windows, two R processes *can* run alongside each other. I did not have the /affinity switch on XP so the apparently smaller 'concurrency overhead' may be explained by CPU thrashing.

The same thing on Linux, on my work E7300:
$ for i in $(seq 1 1); do Rscript -e 'system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})' & done
   user  system elapsed
 46.694   0.088  47.189

$ for i in $(seq 1 2); do Rscript -e 'system.time(for(i in 1:1e4){fisher.test(matrix(c(200,300,400,500),nrow=2))})' & done
   user  system elapsed
 48.007   0.060  48.188
   user  system elapsed
 47.838   0.072  49.487

One may conclude that an E7300 is measurably faster than an E7200 (big surprise there), that the 'concurrency overhead' is perhaps a bit larger on Linux, and most importantly, that the two concurrent R processes can use the two CPUs just fine.
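
To put a rough number on that 'concurrency overhead', here is a small sketch. The helper name overhead_pct is my own invention, and comparing the single run against the slower of the two concurrent runs is an assumption about what counts as overhead. The elapsed times are copied from the listings above.

```shell
# overhead_pct SINGLE CONCURRENT
# Prints how much longer (in percent) a concurrent run took than the
# single run. Both arguments are elapsed seconds.
# (Hypothetical helper, not part of R or of any tool used in this thread.)
overhead_pct() {
    awk -v single="$1" -v conc="$2" \
        'BEGIN { printf "%.1f\n", (conc - single) / single * 100 }'
}

# Single run vs. the slower of the two concurrent runs.
overhead_pct 54.63 56.08     # Windows XP, E7200 -> 2.7
overhead_pct 47.189 49.487   # Linux, E7300      -> 4.9
```

With only one measurement per configuration, these percentages are indicative at best.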

BUT! (And picture me saying this just like Tim the Enchanter from Monty Python and the Holy Grail.)

When I tried the specific R commands I need for my processing, I got this:
$ tail Rtest1.txt
0.05
0.05
0.05
0.05
0.05
0.05
0.13
0.08
0.10
0.15
$ wc -l Rtest1.txt
91135 Rtest1.txt
$ cp Rtest1.txt Rtest2.txt
$ cat Rtest1.R
library(splines)
library(survival)
library(NADA)
outcome=scan("Rtest1.txt")
cenoutcome=rep(FALSE, length(outcome))
cenoutcome[outcome==min(outcome)]=TRUE
pyros=cenros(outcome,cenoutcome)
mean(pyros)
proc.time()
# Rtest2.R is the same, but it reads Rtest2.txt

$ Rscript Rtest1.R 2> /dev/null &
[1] 0.03195313
   user  system elapsed
 22.201   4.096  26.293

$ Rscript Rtest1.R 2> /dev/null & Rscript Rtest2.R 2> /dev/null &
[1] 0.03195313
   user  system elapsed
 40.358   6.328  46.839
[1] 0.03195313
   user  system elapsed
 39.706   6.044  48.128

So it is not R itself that has a problem with multiprocessing environments, nor is it my clumsy Perl threaded implementation: it is this specific R package.
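
The contrast is easier to see as a speedup figure. This is only a sketch: speedup is a hypothetical helper name, and it assumes that running the two jobs back to back would take twice the single-run elapsed time. A value near 2 means both cores were used; a value near 1 means the two processes effectively ran one after the other.

```shell
# speedup SINGLE_ELAPSED CONCURRENT_ELAPSED [NPROCS]
# Estimated serial time (NPROCS * SINGLE_ELAPSED) divided by the wall
# time of the concurrent run. NPROCS defaults to 2.
# (Hypothetical helper; the serial time is an estimate, not a measurement.)
speedup() {
    awk -v single="$1" -v conc="$2" -v n="${3:-2}" \
        'BEGIN { printf "%.2f\n", n * single / conc }'
}

# Elapsed times from the Linux runs above, using the slower concurrent process.
speedup 47.189 49.487   # fisher.test loop      -> 1.91
speedup 26.293 48.128   # cenros script (NADA)  -> 1.09
```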

BrowserUk, I sincerely thank you for taking interest in my silly little problem. You helped me a lot. I am indebted to you.

Re^11: Using threads to run multiple external processes at the same time
by BrowserUk (Patriarch) on Sep 07, 2009 at 19:56 UTC

    6 seconds of system time out of 48 elapsed is pretty heavy system usage. It'd be interesting to know what those packages are doing, but I took a look at the short descriptions on CRAN and realised that they are way beyond my knowledge or interest to comprehend. You'll need to take your questions to the R experts now.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.