in reply to MCE: Slow IPC between child and gather process in parent
Your listing of Throughput per Thread Count stops at 8 concurrent reads. Disk I/O is usually plotted at NCQ depths of 1, 2, 4, ... 32, and sometimes beyond.
What model SSD is being used? Single disk, RAID, anything unusual about the setup? Streaming read transfer rate?
Did you try running the task with elevated I/O priority, using nice or whatever the ionice analog is on Mac?
mmap being faster than read() for small files is curious. Definitely try sysread() with a comfortably large buffer. A plain read() ought to result in fewer syscalls, so maybe check the number of context switches and page faults for each approach.
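For the sysread() suggestion, here is a rough sketch of what I mean. The helper name slurp_sysread and the 1 MiB default buffer are my own placeholders, not anything from your code; adjust to taste.

```perl
use strict;
use warnings;

# Hypothetical helper: slurp a file with sysread() using a large buffer.
# Returns the file contents and the number of sysread() calls made.
sub slurp_sysread {
    my ($path, $bufsize) = @_;
    $bufsize //= 1 << 20;    # 1 MiB, an assumed "comfortably large" size

    open(my $fh, '<:raw', $path) or die "open $path: $!";
    my ($data, $calls) = ('', 0);
    while (1) {
        # Read up to $bufsize bytes, appending at the current end of $data.
        my $n = sysread($fh, $data, $bufsize, length $data);
        die "sysread $path: $!" unless defined $n;
        $calls++;
        last if $n == 0;     # 0 bytes read means EOF
    }
    close $fh;
    return ($data, $calls);
}
```

With a buffer larger than the file, this should typically come down to two sysread() calls per file: one that returns the data and one that returns 0 at EOF. If you already know the size from stat, you can request exactly that many bytes and skip the second call.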
Lastly, the number of files is quite large. Did you say what the time was to just find+stat the files (or du the directory)?
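To get that baseline number, something along these lines would do. This is only a sketch: time_find_stat is a made-up name, and it just walks the tree and stats every regular file without opening any of them.

```perl
use strict;
use warnings;
use File::Find ();
use Time::HiRes ();

# Hypothetical helper: walk $dir, lstat every entry, and return the number
# of regular files, their total size in bytes, and elapsed wall-clock seconds.
sub time_find_stat {
    my ($dir) = @_;
    my ($count, $bytes) = (0, 0);
    my $t0 = Time::HiRes::time();
    File::Find::find({
        no_chdir => 1,
        wanted   => sub {
            my @st = lstat($File::Find::name) or return;  # skip on failure
            return unless -f _;    # regular files only, reusing the lstat result
            $count++;
            $bytes += $st[7];      # element 7 of the stat list is size in bytes
        },
    }, $dir);
    return ($count, $bytes, Time::HiRes::time() - $t0);
}
```

Run it once cold and once warm; the difference between the two gives a feel for how much of the total time is metadata lookup rather than reading file contents.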