in reply to Performant Path Iteration

Some thoughts:
  1. Are you sure that reading from the filesystem is the bottleneck?
  2. Your results depend heavily on the directory topology. I'd expect readdir to perform better on "bigger" directories.
  3. Do these files always change? If not, caching them in a DB and refreshing them only when the directory's modification timestamp has changed might be worth considering.
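
To make point 3 concrete, here is a minimal sketch of a listing cache keyed on the directory's modification time. It is written in Python purely for illustration; the same idea works in Perl with `stat` and `opendir`/`readdir`. The name `cached_listing` and the in-memory dict are hypothetical stand-ins for whatever DB you would actually use. One caveat: a directory's mtime normally changes only when entries are added, removed, or renamed, not when the contents of an existing file change, so this only helps with detecting changes to the listing itself.

```python
import os

# Hypothetical in-memory cache: directory path -> (mtime_ns, sorted entry names).
# A real setup would persist this in a DB instead of a dict.
_listing_cache = {}

def cached_listing(path):
    """Return the entry names in `path`, re-reading only if its mtime changed."""
    mtime = os.stat(path).st_mtime_ns
    hit = _listing_cache.get(path)
    if hit is not None and hit[0] == mtime:
        # Directory timestamp unchanged: reuse the stored listing.
        return hit[1]
    # Timestamp differs (or first visit): read the directory again.
    with os.scandir(path) as entries:
        names = sorted(entry.name for entry in entries)
    _listing_cache[path] = (mtime, names)
    return names
```

Whether this pays off depends on how often the directories actually change between runs; if most of them change every time, the extra `stat` per directory is pure overhead.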

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice

Re^2: Performant Path Iteration
by learnedbyerror (Monk) on Apr 23, 2019 at 04:24 UTC

    Answers by point:

    1. Yes. Monitoring the processors with htop shows that CPU utilization is pretty low, so the process appears to be waiting on I/O rather than computing.
    2. You are probably right. I will create some test datasets to investigate.
    3. The vast majority of the files change with each iteration, so there would be limited benefit in this approach. In several parallel workflows I do a lot of caching, originally with BerkeleyDB, but I moved to LMDB about a year ago and got a nice performance bump.

    Thanks for your response, lbe