Sorry to labour the point, but having cast around for references, I can't see how parallelizing an IO-bound process could ever reap performance benefits on "commodity hardware", defined for the purposes of discussion as a single-CPU system with a single hard drive.
If one process is IO-bound, then by definition it spends most of its time waiting for the OS kernel to complete its IO requests. I.e., the route through the kernel IO routines, device driver and disk drive hardware is the limiting factor.
If you start a second copy of the process, then it too will spend its time waiting for its IO requests to complete, but it will also have to wait for the IO chain to complete the first process's IO requests before that chain gets around to servicing those from the second process.
The bottleneck, wherever in the IO chain it falls, will not suddenly widen because a second process starts making requests. The only way that could happen is if the kernel held some percentage of its potential throughput 'in reserve' for second and subsequent processes.
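To make that argument concrete, here is a toy model of it. It is only a sketch under assumptions of my own, not a measurement of any real kernel or drive: the device services exactly one request at a time, every request costs the same fixed time, the processes do no CPU work between requests, and there are no seek or caching effects. The function name `total_time` and the 10 ms service time are made up for illustration.

```python
def total_time(num_requests, num_processes, service_ms=10):
    """Milliseconds needed to complete num_requests IO requests that are
    shared among num_processes, on a device that handles one at a time.

    Assumptions (illustrative only): fixed cost per request, no CPU work
    between requests, no seek, caching or request-reordering effects.
    """
    # Share the requests out as evenly as possible among the processes.
    remaining = [num_requests // num_processes] * num_processes
    for i in range(num_requests % num_processes):
        remaining[i] += 1

    device_clock_ms = 0  # time at which the device finishes its latest request
    # Round-robin: each process with work outstanding submits one request,
    # and the single device services those requests strictly in turn.
    while any(remaining):
        for p in range(num_processes):
            if remaining[p]:
                device_clock_ms += service_ms
                remaining[p] -= 1
    return device_clock_ms


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        print(n, "process(es):", total_time(1000, n), "ms")
```

Under those assumptions the elapsed time is the same however many processes share the work, because every request still passes through the same one-at-a-time device. A real system could deviate from this in either direction (request reordering in the driver might help, extra seeking between competing files might hurt), which the model deliberately ignores.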
That's not a completely ridiculous idea. Certainly some network protocols do something akin to this. SNA, for example, would resist allocating all the bandwidth of any given point-to-point link to a single end-to-end connection, and would attempt always to hold some bandwidth in reserve for low-volume, high-priority traffic. Years ago, I remember breaking 1.4MB diskette images into smaller chunks for transmission across SNA networks, as smaller transfers were always given priority over larger ones. But I've never heard of, nor found reference to, any device driver or bus management system that does anything similar.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.