in reply to Re^2: perl performance vs egrep
in thread perl performance vs egrep

My first thought about making the search multi-threaded is that the disk then has to read from several files at once, one per thread. On a spinning hard drive that probably means the read head has to move around the surface of the disk more than it would if each file were read sequentially.

It is hard to predict which approach will give the best read performance. I think it's reasonable to assume that your OS and filesystem try to keep each file stored contiguously on the disk, so I would expect searching the files one after another to be faster.

You might want to time running 'wc' on all of the files in sequence vs. at different levels of concurrency to see what works best for reading the data.
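Something along these lines could do the comparison. It is only a rough sketch: the function names are made up for illustration, the concurrency levels 2, 4 and 8 are arbitrary, and it assumes an xargs that supports -0 and -P (GNU and BSD xargs do, POSIX does not require them). It uses 'wc -l' so that every byte of each file really has to be read.

```shell
#!/bin/sh
# Sketch: time reading a set of files with wc, sequentially vs. concurrently.
# All function names here are illustrative, not from any existing tool.

# Read the files one after another; wc -l has to read every byte.
read_sequential() {
    for f in "$@"; do
        wc -l < "$f"
    done
}

# Read the files with up to $1 wc processes running at once.
# Assumes xargs supports -0 and -P (GNU/BSD extensions).
read_concurrent() {
    njobs=$1
    shift
    printf '%s\0' "$@" | xargs -0 -n 1 -P "$njobs" wc -l
}

# Run a command, discard its output, and print the elapsed whole seconds.
elapsed() {
    start=$(date +%s)
    "$@" > /dev/null
    end=$(date +%s)
    echo $((end - start))
}

# Only run the comparison when files are given on the command line.
if [ "$#" -gt 0 ]; then
    echo "sequential: $(elapsed read_sequential "$@")s"
    for n in 2 4 8; do
        echo "concurrency $n: $(elapsed read_concurrent "$n" "$@")s"
    done
fi
```

Save it under any name you like and pass it the files you want to search. Keep in mind that after the first pass the files may be served from the OS cache rather than the disk, so later runs can look faster than they should. Use a data set larger than RAM, or clear the cache between runs (on Linux, as root: sync; echo 3 > /proc/sys/vm/drop_caches).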

The whole point is that your disk is probably much slower than your CPU, so it will probably be the much bigger bottleneck, especially if you start moving the read head around a lot.