in reply to Very large text file - simple indexing
Your post is a little confusing. In the sample data, you show field_1 as numeric and apparently incrementing by one, but it doesn't start from either zero or one. In the sample code, you use this numeric field as the index into an array ($index[$temp[0]]=$temp[1];), but immediately afterwards you show $index(field 1 value) = field 2 value, using parens rather than square brackets.
If the answer to all these questions is yes, then possibly the easiest solution to the problem would be to use Tie::File. Read the excellent documentation for this module for the full nitty-gritty, but simply stated, it allows you, with a single statement, to treat a file as an array. Once you have tied the array to the file, you can just use the array as if it were entirely in memory and it takes care of caching, flushing, opening & closing it. You can specify how much memory you wish to allocate to the caching of the file and thereby make your own choices about the trade-off between memory use and performance.
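To make the "single statement" concrete, here is a minimal sketch of tying a file to an array. The sample data, the temporary file, and the space-separated layout are my assumptions, not your real file; the memory option is the Tie::File setting that caps the read cache, in bytes.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Made-up sample data standing in for the real file: "field_1 field_2" per line.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "1001 apple\n1002 banana\n1003 cherry\n";
close $fh or die "Cannot close $file: $!";

# Tie the array to the file; 'memory' caps the read cache in bytes.
tie my @records, 'Tie::File', $file, memory => 2_000_000
    or die "Cannot tie $file: $!";

# Element N is line N of the file (zero-based, trailing newline removed).
my $third = $records[2];
print "$third\n";

# Assigning to an element rewrites that line in the file.
$records[0] = '1001 apricot';
my $first = $records[0];

untie @records;
```

Note that each element still holds the whole line, both fields included.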
The only downside given your file format is that each array element would contain both fields, but it would be a fairly trivial process to modify the module for your own purposes to remove field_1 on the FETCHes and replace it on STOREs.
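If you would rather not touch the module, roughly the same effect can be had with two small helpers around the tied array. This is a swapped-in alternative to modifying Tie::File, not the modification itself. The names fetch_value and store_value are hypothetical, and the sketch assumes the two fields are separated by whitespace.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Made-up sample data: "field_1 field_2", whitespace-separated (an assumption).
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "1001 apple\n1002 banana\n1003 cherry\n";
close $fh or die "Cannot close $file: $!";

tie my @records, 'Tie::File', $file
    or die "Cannot tie $file: $!";

# Hypothetical helper: return only field_2 of record $n.
sub fetch_value {
    my ($recs, $n) = @_;
    my $rec = $recs->[$n];
    return unless defined $rec;
    my (undef, $value) = split ' ', $rec, 2;
    return $value;
}

# Hypothetical helper: replace field_2 of record $n, keeping its field_1.
sub store_value {
    my ($recs, $n, $value) = @_;
    my $rec = $recs->[$n];
    die "No record $n\n" unless defined $rec;
    my ($key) = split ' ', $rec, 2;
    $recs->[$n] = "$key $value";
    return;
}

my $before = fetch_value(\@records, 1);
store_value(\@records, 1, 'blueberry');
my $after = fetch_value(\@records, 1);
my $raw   = $records[1];    # the full line as stored in the file

untie @records;
```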
If the answer to any of the four questions above is no, for example if the sequence numbers do not start from 0 or 1, or if the sequences have large gaps, then you would need to make more substantial changes to the module to map the sequence numbers to record numbers. That may be more work than you want to do, but it's worth considering if there is an algorithmic relationship involved.
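As an illustration of the simplest algorithmic relationship, a constant offset, here is a rough sketch that again sits outside the module rather than inside it. It assumes field_1 increments by one with no gaps, so that record number = sequence number minus the first sequence number. The helper name value_for_seq and the sample data are hypothetical. The key check at the end catches the case where that assumption does not hold.

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Made-up sample data: sequence numbers start at 1001 and increment by one.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "1001 apple\n1002 banana\n1003 cherry\n";
close $fh or die "Cannot close $file: $!";

tie my @records, 'Tie::File', $file
    or die "Cannot tie $file: $!";

# The sequence number of the first record is the constant offset.
my ($base) = split ' ', $records[0], 2;

# Hypothetical helper: map a sequence number to a record and return field_2.
sub value_for_seq {
    my ($recs, $base, $seq) = @_;
    my $n = $seq - $base;
    return if $n < 0;                 # before the first record
    my $rec = $recs->[$n];
    return unless defined $rec;       # past the end of the file
    my ($key, $value) = split ' ', $rec, 2;
    return unless $key == $seq;       # a gap broke the offset assumption
    return $value;
}

my $hit  = value_for_seq(\@records, $base, 1002);
my $miss = value_for_seq(\@records, $base, 2000);

untie @records;
```

With large or irregular gaps a constant offset will not do, and you are back to needing a real mapping from sequence numbers to record numbers.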