Yes, monitoring the processors with htop shows that CPU utilization is pretty low.
You are probably right. I'll create some test datasets to investigate.
The vast majority of the files change with each iteration, so there would be limited benefit in this approach.
In several parallel workflows, I do a lot of caching, originally with BerkeleyDB, but I moved to LMDB about a year ago and got a nice performance bump.