Welcome to a real-world demonstration of why algorithm efficiency matters. :-)

The likely reason why stat is so slow is that each stat has to traverse the entire directory linearly to find the file that you are interested in. That means that you are going back to the directory structure 10,000 times, each time scanning on average about 5,000 entries to find the file of interest. The resulting 50,000,000 fetches of information is what is taking the time.
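To put a number on that, here is a back-of-the-envelope sketch. It assumes the filesystem really does scan the directory linearly and that a lookup touches half the entries on average; the function name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical cost model: every stat scans, on average, half of the
# directory's entries before it finds the file it wants.
sub linear_scan_fetches {
    my ($files) = @_;
    return $files * ($files / 2);
}

for my $n (1_000, 10_000) {
    printf "%d files: about %d entry fetches\n", $n, linear_scan_fetches($n);
}
```

The point is the quadratic growth: ten times as many files costs about a hundred times as many fetches.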

You can verify this by printing every 100th filename. You should see a progressive slowdown, as each stat has to scan past more and more earlier files.
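A minimal sketch of that check, assuming the sizes are being read with -s as in the original question. The sub name and the batch size argument are my own invention; Time::HiRes is only there to get sub-second timings.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Stat every entry in $dir and report the elapsed time per batch of
# $every files. Returns the number of entries statted.
sub report_stat_times {
    my ($dir, $every) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my ($count, $last) = (0, time);
    while (defined(my $name = readdir $dh)) {
        next if $name eq '.' or $name eq '..';
        my $size = -s "$dir/$name";    # the stat call under suspicion
        next if ++$count % $every;
        my $now = time;
        printf "%6d %-30s %.3fs for the last %d\n",
            $count, $name, $now - $last, $every;
        $last = $now;
    }
    closedir $dh;
    return $count;
}

# Directory defaults to the current one if none is given.
report_stat_times(@ARGV ? $ARGV[0] : '.', 100);
```

If the theory is right, the per-batch times should climb steadily as you get deeper into the directory. If they stay flat, the bottleneck is somewhere else.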

You would speed this up by a factor of 10 if you arranged to have 10 directories of 1,000 files each. You would speed it up by a much larger factor with a filesystem that is designed to handle directories with many small files. (Think ReiserFS on Linux.)
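One way to spread files over 10 directories is to derive the subdirectory from the filename itself, so that you can always work out where a file lives without searching. The following is only a sketch: bucket_for is a hypothetical helper, and the byte-sum checksum is just the simplest thing that spreads names around, not a carefully chosen hash.

```perl
use strict;
use warnings;

# Hypothetical helper: map a filename to one of $buckets subdirectory
# names ("00", "01", ...) using the sum of the name's byte values.
sub bucket_for {
    my ($name, $buckets) = @_;
    my $sum = unpack '%32C*', $name;
    return sprintf '%02d', $sum % $buckets;
}

for my $name (qw(abc report.txt)) {
    printf "%s lives in %s/%s\n", $name, bucket_for($name, 10), $name;
}
```

The files would of course have to be written into those subdirectories in the first place for this to help.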

The ideal answer, of course, would be to have your direct pass through the directory structure pull back not just the name, but also the associated metadata. That is what ls and dir do, and it is why they are so much faster. Unfortunately Perl's API doesn't give you direct access to that information.

An incidental note: using Perl does not guarantee portable code. For instance, your use of lc will cause porting problems on operating systems with case-sensitive filesystems (like Unix). (You will be statting a different file than the one you saw, in fact likely one that isn't there.)
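If the lowercased name is only wanted for display or as a lookup key, one way around the problem is to stat the name exactly as readdir returned it and lowercase only the key. This is a sketch under that assumption; the sub name is hypothetical.

```perl
use strict;
use warnings;

# Return a hash ref mapping lowercased filename => size in bytes.
# The stat is done on the name exactly as read from the directory.
# Note: on a case-sensitive filesystem, names differing only in case
# share a key, and the one read last wins.
sub sizes_by_lc_name {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my %size;
    while (defined(my $name = readdir $dh)) {
        next unless -f "$dir/$name";    # stat with the original case
        $size{ lc $name } = -s _;       # reuse that stat for the size
    }
    closedir $dh;
    return \%size;
}
```

The `_` filehandle reuses the result of the previous file test, so each file is statted only once.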


In reply to Re (tilly) 1: I'm falling asleep here by tilly
in thread -s takes too long on 15,000 files by ishk0
