Is a double loop over 1,000,000 and 20,000 iterations not big enough to benefit from fork? I need the process to finish in less than 30 seconds while using all cores. The sleep() example is difficult to compare with the issue I outlined, and I could not find in the documentation how the forking is implemented.
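
To make the workload concrete: 1,000,000 × 20,000 is 2 × 10^10 inner iterations, so finishing in 30 seconds needs roughly 6.7 × 10^8 iterations per second summed over all cores. Below is a minimal sketch of one way to spread such a double loop over every core by splitting the outer range into one contiguous chunk per worker process. It is written in Python only as an assumed stand-in, since the language in question is not stated here; `chunk_ranges`, `work`, `run` and the loop body `(i ^ j) & 1` are hypothetical placeholders, not anything from the library being discussed.

```python
import os
from multiprocessing import Pool


def chunk_ranges(total, parts):
    """Split range(total) into `parts` contiguous (start, stop) pairs."""
    step, extra = divmod(total, parts)
    ranges, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional item each.
        stop = start + step + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def work(bounds):
    """Run the double loop for one slice of the outer range."""
    start, stop, inner = bounds
    total = 0
    for i in range(start, stop):
        for j in range(inner):
            total += (i ^ j) & 1  # placeholder for the real loop body
    return total


def run(outer, inner, workers=None):
    """Give each worker process one outer-range chunk and sum the results."""
    workers = workers or os.cpu_count() or 1
    tasks = [(a, b, inner) for a, b in chunk_ranges(outer, workers)]
    with Pool(workers) as pool:
        return sum(pool.map(work, tasks))


if __name__ == "__main__":
    # Small sizes so the demo finishes quickly; the sizes in question
    # would be run(1_000_000, 20_000).
    print(run(2_000, 500))
```

Each worker is started once and receives a single large chunk, so the cost of creating processes is paid once per core rather than once per iteration. Whether the full-size job fits in 30 seconds still depends on the cost of one inner iteration and on the number of cores available.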