in reply to Quickest way to get the oldest file

Others have suggested you use a database, and I tend to agree with them. It's unclear whether you meant 50,000 files or 50,000 directories each with some number of files, but either way, searching through that many entries in a filesystem is slow (and hopefully it's not 50,000 files all in one directory!). Another mechanism you might be able to use is encoding the time information in the file and directory names themselves. Something like:

2004/01/19/00/filename.01
2004/01/19/00/filename.02
2004/01/19/00/filename.03
...
2004/01/19/01/filename.01
...
2004/01/19/02/filename.01
...

In this hypothetical example the files are arranged as year/month/day/hour/filename.minute. With such a structure, finding the oldest file is relatively easy: take the first entry in sorted order at each level. I don't know whether this technique would be useful for your problem, but there it is.
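To make the idea concrete, here is a minimal sketch in Python (not the language of the original thread) of how the oldest file could be found in such a tree. It assumes the hypothetical layout above, with zero-padded directory names so that sorting them as strings also sorts them by time. The function name `oldest_in_tree` and the `depth` parameter are made up for illustration.

```python
import os

def oldest_in_tree(root, depth=4):
    """Return the path of the oldest file under root, or None if there is none.

    Assumes the hypothetical layout year/month/day/hour/filename.minute,
    with zero-padded names so that string order matches time order.
    """
    names = os.listdir(root)
    if depth == 0:
        # Leaf level (the hour directory): pick the smallest minute suffix.
        files = [n for n in names if os.path.isfile(os.path.join(root, n))]
        if not files:
            return None
        return os.path.join(root, min(files, key=lambda n: n.rsplit(".", 1)[-1]))
    # Try the earliest directory first; fall back to later ones if it is empty.
    for name in sorted(names):
        path = os.path.join(root, name)
        if os.path.isdir(path):
            found = oldest_in_tree(path, depth - 1)
            if found is not None:
                return found
    return None
```

Only one directory per level is normally listed, so the cost depends on the size of each level rather than on the total number of files.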

Or you could just use a database like PostgreSQL, MySQL, Berkeley DB, etc. (I believe all of these are available on both Linux and Windows) with an index created on the time of each entry. :-)
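As a rough illustration of the indexed-database approach, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is swapped in here only because it needs no server; it is not one of the databases named above. The table name `files`, the column names, and the helper functions are all assumptions made for the example.

```python
import sqlite3

def open_index(db_path=":memory:"):
    """Open (or create) a small database with an index on file time."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, mtime REAL NOT NULL)"
    )
    # The index lets the database find the smallest mtime without a full scan.
    conn.execute("CREATE INDEX IF NOT EXISTS files_mtime ON files (mtime)")
    return conn

def record_file(conn, path, mtime):
    """Add a file to the index, or update its time if already present."""
    conn.execute(
        "INSERT OR REPLACE INTO files (path, mtime) VALUES (?, ?)", (path, mtime)
    )
    conn.commit()

def forget_file(conn, path):
    """Remove a file from the index, for example after it is deleted."""
    conn.execute("DELETE FROM files WHERE path = ?", (path,))
    conn.commit()

def oldest_file(conn):
    """Return the path with the smallest mtime, or None if the index is empty."""
    row = conn.execute("SELECT path FROM files ORDER BY mtime ASC LIMIT 1").fetchone()
    return row[0] if row else None
```

The catch with any such index is that it must be updated every time a file is added or removed; otherwise it drifts out of step with the directory.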

Re: Re: Quickest way to get the oldest file
by Anonymous Monk on Jan 20, 2004 at 20:24 UTC
    Thanks all for your replies and helpful suggestions,

    There will be multiple directories, but I will only be concerned with one at a time. The one I will be using will be specified, and it may have up to 50,000 files in it. That is the most we have ever seen in one directory; normally it is more like 1,000. I do want to plan for the worst-case scenario, though. Unfortunately, a file and directory naming structure like the one you are suggesting is not an option, as the structure will be specified by the client.

    I ran gaff's code from above and found its speed reasonable. It was perhaps a little slow when tested against 50,000 files, but I only ran it on this slow machine.

    The only other idea I had was to keep a record of the oldest file in a text file. That record would only need to be updated when a file in that directory is deleted or, more to the point, when that particular file is deleted. So I could return a response to the client and then continue with finding the oldest file in the background...
    Any thoughts?
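
One way the text-file idea might look, sketched in Python rather than Perl: the record is trusted as long as the file it names still exists, and the directory is rescanned only when it does not. The names `scan_oldest` and `cached_oldest` are hypothetical. The sketch assumes that age means modification time, that the record file lives outside the directory being scanned, and that no file with an older timestamp ever arrives after the record is written. It runs the rescan inline rather than in the background.

```python
import os

def scan_oldest(directory):
    """Scan one directory and return the path of its oldest file, or None."""
    oldest_path, oldest_mtime = None, None
    with os.scandir(directory) as entries:
        for entry in entries:
            if not entry.is_file():
                continue
            mtime = entry.stat().st_mtime
            if oldest_mtime is None or mtime < oldest_mtime:
                oldest_path, oldest_mtime = entry.path, mtime
    return oldest_path

def cached_oldest(directory, cache_file):
    """Return the oldest file, rescanning only when the record is stale."""
    try:
        with open(cache_file) as fh:
            cached = fh.read().strip()
    except FileNotFoundError:
        cached = ""
    # The record is still good if the file it names has not been deleted.
    if cached and os.path.isfile(cached):
        return cached
    oldest = scan_oldest(directory)
    with open(cache_file, "w") as fh:
        fh.write(oldest or "")
    return oldest
```

If files can arrive with old timestamps (for example, copied with their times preserved), the record would also need to be checked whenever a file is added.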