in reply to Help performing "random" access on very very large file

Well, some obvious questions are: how many times will you use the file (i.e., will you reuse the same file in later runs)? For each use, what percentage of the lines will you need to randomly seek to? How many times will you access each line?

Dave.

Re^2: Help performing "random" access on very very large file
by downer (Monk) on Jul 16, 2007 at 14:29 UTC
    I will be making a lot of accesses into the file (~5 million?). I know that some lines will be accessed more often than others, but as far as I can tell now, the accesses will be at least initially uniform. Lines will be looked up more than once, so some smart caching would be beneficial, but until I start making the calculations, it's hard to tell which lines will be accessed more. Of course, I have a few big disks, so I can just make a few copies of the file and parallelize the task.
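
    For what it's worth, one common approach for this kind of random line access is to pay for a single sequential pass that records the byte offset where each line starts, then seek to any line on demand and keep a cache of recently fetched lines. Below is a minimal sketch, not a drop-in solution: the file name, cache cap, and line number are only illustrative, and it assumes plain newline-terminated text whose per-line offset index fits in memory.

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $file = 'huge.txt';    # illustrative file name

        # One sequential pass to record the byte offset where each line starts.
        open my $fh, '<', $file or die "open $file: $!";
        my @offset = (0);
        while (<$fh>) {
            push @offset, tell $fh;
        }
        pop @offset;              # drop the trailing EOF offset

        # Very simple cache of already-fetched lines, keyed by line number.
        my %cache;

        sub get_line {
            my ($n) = @_;                          # 0-based line number
            return $cache{$n} if exists $cache{$n};
            seek $fh, $offset[$n], 0 or die "seek: $!";
            my $line = <$fh>;
            %cache = () if keys %cache > 100_000;  # crude cap on memory use
            return $cache{$n} = $line;
        }

        # Fetch an arbitrary line by number (illustrative index).
        print get_line(12_345);

    If the in-memory offset index itself gets too big for a file with hundreds of millions of lines, the offsets can instead be written once to a side file of fixed-width records and themselves looked up with seek in the same way.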