in reply to Re^7: Algorithm advice sought for seaching through GB's of text (email) files
in thread Algorithm advice sought for seaching through GB's of text (email) files
Sorry to labour the point, but having cast around for references, I can't see how parallelizing an IO-bound process could ever reap performance benefits on "commodity hardware"--defined for the purposes of this discussion as a single-CPU system with a single hard drive.
If a process is IO-bound then, by definition, it spends most of its time waiting for the OS kernel to complete its IO requests. I.e., the route through the kernel IO routines, device driver and disk drive hardware is the limiting factor.
If you start a second copy of the process, it too will spend its time waiting for its IO requests to complete, but it will also have to wait for the IO chain to complete the first process's IO requests before that chain gets around to servicing those from the second process.
The bottleneck, wherever in the IO chain it falls, will not suddenly widen because a second process starts making requests. The only way that could happen is if the kernel held some percentage of its potential throughput 'in reserve' for second and subsequent processes.
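To make that argument concrete, here is a toy model, not a measurement. It assumes a single device that services one request at a time, a fixed cost per request, and processes that do no CPU work at all. The function name `simulate` and its parameters are invented for illustration; seeks, caching and read-ahead are deliberately left out.

```python
from collections import deque

def simulate(n_processes, requests_total, service_ms):
    """Return elapsed time (ms) for n_processes to share one serial IO device.

    Assumptions (hypothetical, for illustration only):
    - the work is requests_total equal-sized requests, split evenly;
    - the device handles exactly one request at a time;
    - every request costs the same service_ms, whoever issues it.
    """
    # Split the requests as evenly as possible between the processes.
    pending = [requests_total // n_processes] * n_processes
    for i in range(requests_total % n_processes):
        pending[i] += 1

    # Round-robin queue of process ids waiting on the device.
    waiting = deque(range(n_processes))
    clock_ms = 0
    while waiting:
        pid = waiting.popleft()
        if pending[pid] == 0:
            continue                  # this process had nothing to do
        clock_ms += service_ms        # the device is busy for one request
        pending[pid] -= 1
        if pending[pid] > 0:
            waiting.append(pid)       # rejoin the queue for the next request
    return clock_ms

if __name__ == "__main__":
    for n in (1, 2, 4):
        print(n, "process(es):", simulate(n, 1000, 5), "ms")
```

Under these assumptions the elapsed time depends only on the number of requests and the per-request cost, so adding processes changes who waits, not how long the whole job takes.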
That's not a completely ridiculous idea. Certainly some network protocols do something akin to this. SNA, for example, would resist allocating all the bandwidth of any given point-to-point link to a single end-to-end connection and would always attempt to hold some bandwidth in reserve for low-volume, high-priority traffic. Years ago, I remember breaking 1.4MB diskette images into smaller chunks for transmission across SNA networks, as smaller transfers were always given priority over larger ones. But I've never heard of, nor found reference to, any dd or bus management system that does anything similar.
Re^9: Algorithm advice sought for seaching through GB's of text (email) files
by tilly (Archbishop) on Sep 25, 2006 at 15:16 UTC
by BrowserUk (Patriarch) on Sep 25, 2006 at 16:14 UTC