Re: Performant Path Iteration

Some thoughts:

Are you sure that reading from the filesystem is the bottleneck?
Your results depend heavily on the directory topology; I'd expect readdir to perform better on "bigger" directories.
Do these files always change? If not, caching them in a DB and refreshing only when the directory's modification timestamp has changed might be worth considering.
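To make the caching idea concrete, here is a minimal sketch, not a tested recommendation. It assumes a plain in-memory hash in place of a real DB, and the names `%cache` and `list_dir_cached` are made up for illustration. It rereads a directory with readdir only when the directory's mtime differs from the one stored alongside the cached listing.

```perl
use strict;
use warnings;

# Hypothetical cache: directory path => { mtime => ..., files => [...] }.
# A plain hash stands in for the DB to keep the sketch short.
my %cache;

# Return the entries of $dir, calling readdir again only when the
# directory's mtime differs from the cached value.
sub list_dir_cached {
    my ($dir) = @_;

    my $mtime = (stat $dir)[9];    # element 9 of stat() is the mtime
    die "Cannot stat $dir: $!" unless defined $mtime;

    my $entry = $cache{$dir};
    return @{ $entry->{files} } if $entry && $entry->{mtime} == $mtime;

    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @files = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    $cache{$dir} = { mtime => $mtime, files => \@files };
    return @files;
}
```

Calling `list_dir_cached($dir)` twice in a row hits the filesystem with readdir only once, as long as the directory's mtime stays the same. Two caveats: on typical POSIX filesystems a directory's mtime changes when entries are added, removed or renamed, not when an existing file's content is modified, and with whole-second timestamps a change made within the same second as the last scan can be missed.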

Cheers Rolf
(addicted to the Perl Programming Language :)

Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice

Replies are listed 'Best First'.

Re^2: Performant Path Iteration
by learnedbyerror (Monk) on Apr 23, 2019 at 04:24 UTC

Answers, point by point:

Yes. Monitoring the processors with htop shows that CPU utilization is pretty low.
You are probably right. I will create some test datasets to investigate.
The vast majority of the files change with each iteration, so this approach would be of limited benefit. In several parallel workflows I do a lot of caching, originally with BerkeleyDB, but I moved to LMDB about a year ago and got a nice performance bump.

Thanks for your response, lbe