in reply to using Linux getdents syscall

Quoting the man page of getdents:

These are not the interfaces you are interested in. Look at readdir(3) for the POSIX-conforming C library interface.

That's all you need to know about getdents(2). Perl has a readdir function that calls readdir(3) internally, and I'm quite sure it is optimized. readdir(3) itself is most likely implemented in libc on top of getdents(2), with a fallback to readdir(2) for older kernels.

I'm looking for a fast way to list the contents of a directory (with thousands of files) on Linux by using Perl.

opendir, readdir, closedir. Benchmark that. Compare with ls. Most likely, you won't get faster than ls, simply because perl has higher startup costs and does not run native code, but instead walks a complex data structure (the op tree) representing your Perl script.
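
A minimal sketch of such a benchmark, in case it helps. The helper name list_dir and the use of Time::HiRes for timing are my own choices for illustration, not something from the original question; the directory is taken from the command line and defaults to the current one.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday tv_interval);

# Hypothetical helper: return the names of all entries in $dir,
# skipping "." and "..". Only names are read here; nothing in this
# sub calls stat on the entries.
sub list_dir {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);
    return @names;
}

# Time a single pass over the directory given on the command line.
my $dir     = @ARGV ? $ARGV[0] : '.';
my $t0      = [gettimeofday];
my @names   = list_dir($dir);
my $elapsed = tv_interval($t0);
printf "%d entries in %.6f s\n", scalar(@names), $elapsed;
```

Run it a few times against the directory in question and compare with something like "time ls -f", since the first run is likely to be dominated by disk access and later runs by the cache.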

My guess is that the bottleneck is the disk and its interface, not the actual functions called to read the directory. Sure, libc and perl add some overhead, but not that much.

Alexander

--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)

Re^2: using Linux getdents syscall
by glasswalk3r (Friar) on Nov 24, 2015 at 12:10 UTC

    I don't care about portability or POSIX conformance... I need to delete lots of files as fast as possible (and with low overhead) on Linux.

    readdir works fine, but once the directory holds thousands of files, performance starts to degrade.

    It just occurred to me that I could check whether readdir does a stat system call on each file inside the directory... that would explain why the Perl code to clean the directory is slower. But that wouldn't help me solve this issue anyway.

    Alceu Rodrigues de Freitas Junior
    ---------------------------------
    "You have enemies? Good. That means you've stood up for something, sometime in your life." - Sir Winston Churchill

      There has already been lots of great input on this topic. I'll add that there is no faster way on a *nix box to delete a large number of files from a directory than using xargs.

      Either with: ls [SOME MASK] | xargs rm

      or: ls | grep [SOME MASK] | xargs rm

      --
      “For the Present is the point at which time touches eternity.” - CS Lewis