Thanks for the reply. I'm definitely learning something here. Just to clarify: you are suggesting that I experiment to find the optimum number of simultaneous instances of my benchmark program. You mentioned running them in batch. Would this best be done using the aforementioned Parallel::ForkManager module? I've been reading up on fork. I don't believe the plain fork function has any built-in way to limit the number of children, does it? Is there a general rule for telling whether a process is CPU-bound or I/O-bound?