If you can afford one complete read of the file (and the I/O it costs) before you start using it, you do not actually need a complete index. A sparse index that records the offset of every 10th (or 100th) line tells you approximately where to start reading, so you only have to skip a small number of unwanted lines before the one you are actually looking for. This can be a perfectly viable solution.
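A minimal sketch of that idea in Python (the function names and the granularity of 100 are mine, purely illustrative):

```python
def build_sparse_index(path, every=100):
    """One pass over the file, remembering the byte offset
    of every 100th line (0-based line numbers)."""
    index = {}                          # line number -> byte offset
    offset = 0
    with open(path, "rb") as f:
        for lineno, line in enumerate(f):
            if lineno % every == 0:
                index[lineno] = offset
            offset += len(line)
    return index

def get_line(path, index, target, every=100):
    """Seek to the nearest indexed line at or before `target`,
    then skip at most `every - 1` unwanted lines."""
    start = (target // every) * every
    with open(path, "rb") as f:
        f.seek(index[start])
        for _ in range(target - start):
            f.readline()
        return f.readline().decode()
```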
If each line carries its own line number (so that reading any line tells you unambiguously where you are), you can use a binary-search approach to find your line in about log2(n) reads without any index (see the sketch after the list). You should:
- Get the total size of the file
- Seek to roughly the middle of the file and read the first complete line starting after that point
- If its line number is smaller than the one you are looking for, repeat the process on the second half, seeking the midpoint between the middle of the file and the end;
- if it's larger, repeat on the first half, seeking the midpoint between the start of the file and the middle;
- if it matches, return the line and exit.
Take care that the loop terminates even when the line you are looking for does not exist in the file; otherwise you can loop indefinitely (for instance, stop once the search window no longer shrinks, or set a maximum number of steps before bailing out).
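Here is one way this could look in Python, a sketch assuming every line starts with its own ascending line number followed by a space (e.g. `10038 some payload`); the names and the `max_steps` cap are illustrative:

```python
import os

def find_line(path, target, max_steps=64):
    """Binary-search a file whose lines begin with their own
    ascending line number. Returns the line, or None if absent."""
    with open(path, "rb") as f:
        first = f.readline()                 # the loop below never reads the
        if first and int(first.split()[0]) == target:
            return first.decode()            # first line, so check it up front
        lo, hi = 0, os.path.getsize(path)
        for _ in range(max_steps):           # bail out instead of looping forever
            if lo >= hi:
                return None                  # window is empty: the line isn't there
            mid = (lo + hi) // 2
            f.seek(mid)
            f.readline()                     # skip the (probably partial) line
            line = f.readline()
            if not line:                     # ran past the end: try the first half
                hi = mid
                continue
            number = int(line.split()[0])
            if number == target:
                return line.decode()
            elif number < target:
                lo = f.tell()                # target lies further down the file
            else:
                hi = mid                     # target lies further up
        return None
```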
If the lines in the file are expected to be more or less even in length, you could also consider a prediction method to estimate where the target line sits (see the sketch after this list). It goes like this:
- We read a line in the middle of the file
- We now know its line number (say, 10038) and that it starts at offset 120046
- So we expect the average line to be about 120046 / 10038 ≈ 11.9 chars long
- If we are looking for line 9500, we start reading at offset 9500 * 11.9 ≈ 113050, and then we adjust our estimate based on what we actually find there
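A rough sketch of that, again assuming self-numbered lines; the backtrack margin of ten average lines, and bailing out on overshoot instead of retrying, are arbitrary choices of mine:

```python
import os

def predicted_find(path, target):
    """Sample one line near the middle to estimate the average line
    length, jump a little before the target's predicted offset,
    then scan forward to it."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        f.seek(size // 2)
        f.readline()                        # resync to the next line start
        offset = f.tell()
        number = int(f.readline().split()[0])
        avg = offset / number               # e.g. 120046 / 10038 ~ 11.9

        # Land slightly early on purpose, so the target is still ahead.
        guess = max(0, int(target * avg - 10 * avg))
        f.seek(guess)
        if guess:
            f.readline()                    # resync again
        for line in f:
            n = int(line.split()[0])
            if n == target:
                return line.decode()
            if n > target:                  # overshot: the estimate was off;
                return None                 # a real version would back up and retry
        return None
```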
Of course, all of these options have to be weighed against your actual use case.