in reply to Re^2: Advice on Efficient Large-scale Web Crawling
in thread Advice on Efficient Large-scale Web Crawling
I suggest you simply change that to two hex digits per directory name, e.g.
pool/todo/a6/86/a6869c08bcaa2bb6f878de99491efec4f16d0d69
That should reduce the average number of files per directory to a much more reasonable 60 and change.
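For illustration, here is a minimal sketch of that layout in Python (my choice of language for the example, not necessarily what your crawler uses). Two levels of two hex digits each give 256 * 256 = 65,536 leaf directories. The function names and the idea of hashing the URL with SHA-1 are assumptions on my part; only the pool/todo prefix and the digest come from the example above.

```python
import hashlib
from pathlib import PurePosixPath


def shard_digest(digest: str, root: str = "pool/todo") -> PurePosixPath:
    """Map a hex digest to root/<first 2 hex digits>/<next 2 hex digits>/<digest>."""
    if len(digest) < 4:
        raise ValueError("digest too short to split into two 2-digit levels")
    # Two hex digits per directory name: 256 entries at each level.
    return PurePosixPath(root, digest[:2], digest[2:4], digest)


def shard_url(url: str, root: str = "pool/todo") -> PurePosixPath:
    """Hypothetical helper: assumes the file name is the SHA-1 of the URL."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return shard_digest(digest, root)


print(shard_digest("a6869c08bcaa2bb6f878de99491efec4f16d0d69"))
# prints pool/todo/a6/86/a6869c08bcaa2bb6f878de99491efec4f16d0d69
```

Keeping the full digest as the file name means the path can always be rebuilt from the digest alone, so nothing extra has to be stored to find a file again.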
And yes, benchmarking (lots and lots of benchmarking) and tweaking seem to be the best way to tackle this kind of problem.