in reply to Design flat files database
What should work faster for accessing the user dir with id 748332: /74/83/32/748332/ or /7/4/8/3/3/2/748332
The reasoning behind placing files in directory structures formed by partitioning the name-space is to avoid huge numbers of files in a single directory, which (on some file-systems) has to be searched linearly.
E.g. if your 6-digit ID defines your ID space, then placing 1 million files in a single directory means that, on average, you have to inspect 500,000 entries to find the file you are looking for.
But if you split that into /xx/yy/zz.dat, then on average you will inspect 50 entries in the first level, 50 in the second and 50 in the final level. 150 inspections vs. 500,000 is a good trade.
Using (a modified version of) your second schema /p/q/r/x/y/z.dat, it will (on average) be 5 in each of the 6 levels giving 30 inspections.
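The arithmetic above can be checked with a few lines of Python. This is only a sketch of the same back-of-envelope model (a linear scan that inspects, on average, half the entries at each level); the function name avg_inspections is made up for illustration.

```python
def avg_inspections(levels: int, entries_per_level: int) -> float:
    """Average entries inspected with a linear scan at each level.

    Uses the same approximation as the text: on average, half of the
    entries in a directory are inspected before the target is found.
    """
    return levels * entries_per_level / 2


# One flat directory holding all 1,000,000 files.
flat = avg_inspections(1, 1_000_000)
# /xx/yy/zz.dat: three levels of 100 entries each.
two_digit = avg_inspections(3, 100)
# /p/q/r/x/y/z.dat: six levels of 10 entries each.
one_digit = avg_inspections(6, 10)

print(flat, two_digit, one_digit)  # 500000.0 150.0 30.0
```

The model ignores the cost of opening each extra directory level, which is one reason the deeper layout may not win in practice.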
The latter sounds like a good idea, but in practice the benefits can be outweighed by the complexities. This depends upon the actual file-system in use, and you will need to test to see what works best on your particular file-system.
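One rough way to run such a test is sketched below in Python. Everything in it is an assumption for illustration: the helper names, the 6-digit IDs, the .dat extension and the use of os.stat() as the "lookup". With a small sample and a warm OS cache the differences may not show up at all, so treat the numbers as a starting point only.

```python
import os
import random
import tempfile
import time


def layout_path(root, user_id, width):
    """Map a 6-digit ID to a path; width=6 gives one flat directory."""
    digits = f"{user_id:06d}"
    parts = [digits[i:i + width] for i in range(0, 6, width)]
    return os.path.join(root, *parts) + ".dat"


def time_lookups(width, ids, repeats=5):
    """Create one empty file per ID, then time os.stat() on each of them."""
    with tempfile.TemporaryDirectory() as root:
        paths = [layout_path(root, i, width) for i in ids]
        for path in paths:
            os.makedirs(os.path.dirname(path), exist_ok=True)
            open(path, "w").close()
        start = time.perf_counter()
        for _ in range(repeats):
            for path in paths:
                os.stat(path)
        return time.perf_counter() - start


if __name__ == "__main__":
    sample = random.sample(range(1_000_000), 2000)
    for width, label in ((6, "flat"), (2, "/xx/yy/zz"), (1, "/p/q/r/x/y/z")):
        print(f"{label:>14}: {time_lookups(width, sample):.4f}s")
```

Run it on the file-system you actually intend to deploy on, and with a file count close to the real one, since the temporary directory may live on a different file-system.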
Files in linux directory are indexed
Again, this depends upon the file-system in use. AFAIK, ext2 directories are not indexed (or hashed), and ext3 directories only are when the dir_index feature is enabled, but other *nix file-systems may be.
For example, someone posted a message. What is better: to save all the replies for this message in a single file, or to save each reply in a separate file in a folder created for this message and, when someone views the message, gather all the replies from the files?
Reading between the lines, I'm guessing you're thinking of implementing a message-board type system (not unlike PM).
If so, the "better" will depend upon many factors:
A comment: Your proposed schemas /74/83/32/748332/ & /7/4/8/3/3/2/748332/ both incorporate two levels of redundancy. There is no benefit in this.
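To illustrate the point about redundancy, here is a minimal Python sketch that derives the path from the ID alone, so the last path component carries only the digits not already used by the directories above it. The function name partitioned_path and its parameters are hypothetical, and the .dat extension just follows the /xx/yy/zz.dat example used earlier.

```python
def partitioned_path(user_id: int, width: int, digits: int = 6,
                     ext: str = ".dat") -> str:
    """Split a zero-padded ID into groups of `width` digits.

    The full ID is never repeated in the path: every digit appears once.
    """
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    if digits % width != 0:
        raise ValueError("width must divide the number of digits")
    padded = str(user_id).zfill(digits)
    if len(padded) != digits:
        raise ValueError("user_id has more digits than the ID space allows")
    parts = [padded[i:i + width] for i in range(0, digits, width)]
    return "/" + "/".join(parts) + ext


print(partitioned_path(748332, 2))  # /74/83/32.dat
print(partitioned_path(748332, 1))  # /7/4/8/3/3/2.dat
```

The ID can always be recovered by stripping the separators and the extension, so nothing is lost by dropping the repeated /748332/ component.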
Two questions:
If there is a tutorial or book about flat-file databases, it would be great!
The only paper I ever saw on the subject was an IBM RedBook, but that was 15 or 20 years ago, so my memory of it is vague. You could try searching that site, but I don't have any good keywords to offer you right now. Maybe some will come back to me.
Re^2: Design flat files database
by AlfaProject (Beadle) on Jul 15, 2011 at 12:07 UTC
by BrowserUk (Patriarch) on Jul 15, 2011 at 15:55 UTC
by AlfaProject (Beadle) on Jul 15, 2011 at 21:15 UTC