If it were me, I'd put them in directories based on the hashes. For example, I'd put "d41d8cd98f00b204e9800998ecf8427e" in "d/4/1/d/8/d41d8cd98f00b204e9800998ecf8427e". With one hex digit per level, five levels deep gives 16^5 (about a million) leaf directories, so three million files would average about three per leaf. Four levels gives 65,536 leaf directories with an average of about 46 files each, so maybe you want just four levels. The deeper you go, the more room to grow.
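
Here's a rough sketch of that layout and the arithmetic behind it, in Python. The names `sharded_path` and `avg_files_per_leaf` are made up for illustration, and it assumes the file names are lowercase hex digests with one hex digit (16 possibilities) per directory level.

```python
def sharded_path(digest: str, levels: int = 4) -> str:
    """Nest a file under one directory per leading character of its digest.

    Assumption: `digest` is a hex string such as an MD5 hex digest.
    """
    if not 0 <= levels <= len(digest):
        raise ValueError("levels must be between 0 and len(digest)")
    # One directory per leading character, then the full digest as the file name.
    return "/".join([*digest[:levels], digest])


def avg_files_per_leaf(total_files: int, levels: int) -> float:
    """Average files per leaf directory, assuming digests are evenly spread."""
    # Each level fans out 16 ways (one hex digit), so there are 16**levels leaves.
    return total_files / 16 ** levels


print(sharded_path("d41d8cd98f00b204e9800998ecf8427e", levels=5))
print(avg_files_per_leaf(3_000_000, 5))  # a bit under 3
print(avg_files_per_leaf(3_000_000, 4))  # a bit under 46
```

Changing `levels` is the only knob: each extra level divides the average directory size by 16.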

I find it hard to believe you'll never do a directory listing. Eventually someone will do one by accident. Where I work, we had a Linux machine brought to its knees by an 'ls' in a directory with too many files. We thought it had died completely, but it eventually came back.

It's possible that ext3 doesn't have this problem (I don't know), but on some filesystems even a check for existence involves a brute force search through the contents of the directory.

Having looked just now, I see there's a filesystem feature for 'mke2fs' called "dir_index" (set with the -O option) which "uses hashed b-trees to speed up lookups in large directories." Also, a "tune2fs -l /dev/sda1" tells me that my filesystem has this feature even though I don't recall asking for it. Maybe it's the default. It might be worth your while to check yours.
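
If you'd rather script that check than eyeball it, something like the following might do. It is only a sketch: `has_dir_index` is a made-up helper, and it assumes the "tune2fs -l" output contains a "Filesystem features:" line with space-separated feature names, which is what I see here. The sample text is abbreviated for illustration, not real output from any particular machine.

```python
def has_dir_index(tune2fs_output: str) -> bool:
    """Report whether 'dir_index' appears in the 'Filesystem features:' line.

    Assumption: the text is the output of `tune2fs -l <device>`.
    """
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "filesystem features":
            return "dir_index" in value.split()
    return False  # no features line found


# Abbreviated, made-up sample of the relevant part of the output.
sample = """\
Filesystem OS type:       Linux
Filesystem features:      has_journal resize_inode dir_index filetype
"""
print(has_dir_index(sample))
```

To feed it real output you would run "tune2fs -l" on your own device (usually as root) and pass the captured text in.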


In reply to Re: (OT) should i limit number of files in a directory by kyle
in thread (OT) should i limit number of files in a directory by leocharre
