eff_i_g,
You really haven't said anything at all about how the program works or how it decides what data it needs and when.
Since many of your fields are not needed, they need not be included in your data structure, provided you can know in advance that they won't be needed. If only one id is ever worked with at a time, there is no need to ever load more than one record into memory at a time. Alternatively, it may be possible to employ an MRU cache, in which the splits are cached in arrays but only a fixed number are kept: the most recently used stay in the cache and the others expire.
Try to put yourself in my shoes. Read what you have written about your program, your data structure, and your problem and see if you feel you have provided the necessary information to help. Again, we are just guessing.
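To make the cache idea concrete, here is a minimal sketch in Python, not a drop-in solution. It assumes records are delimited lines that can be fetched one id at a time; the `SplitCache` class, the `load_line` callback, the `|` separator, and the capacity are all hypothetical, since the real data layout has not been shown.

```python
from collections import OrderedDict


class SplitCache:
    """Cache the split fields for at most `capacity` ids.

    The most recently used ids stay cached; the least recently used expire.
    """

    def __init__(self, load_line, capacity=1000, sep="|"):
        self.load_line = load_line  # hypothetical callback: id -> raw record line
        self.capacity = capacity
        self.sep = sep              # assumed field separator
        self.misses = 0             # how many times a record had to be loaded
        self._cache = OrderedDict()

    def fields(self, rec_id):
        if rec_id in self._cache:
            # Cache hit: mark this id as the most recently used.
            self._cache.move_to_end(rec_id)
            return self._cache[rec_id]
        # Cache miss: load and split the record only now, when it is needed.
        self.misses += 1
        fields = self.load_line(rec_id).split(self.sep)
        self._cache[rec_id] = fields
        if len(self._cache) > self.capacity:
            # Evict the entry that has gone unused the longest.
            self._cache.popitem(last=False)
        return fields
```

With something like `cache = SplitCache(fetch_record, capacity=500)`, where `fetch_record` stands in for however a single record is read, memory use is bounded by the capacity rather than by the size of the whole lookup file.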
Limbic,
I apologize; I'm trying :) This is a little challenging since I am also learning.
The basic programming process is explained in my reply to BrowserUk. It's that simple, but it deals with a lot of information. The problem is with step 2, because it hashes all of the data provided when the script may need only a fraction of it.
To reiterate: Correct. The whole lookup is not needed for processing. The pins that are needed could be determined by reading all of the pins in the source file; the largest one is around 25MB, 40,000 lines. If the file were using only the pins 123, 456, and 789, I could look for just those in the other file when building the hash.
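That two-pass idea might look roughly like the Python sketch below. It rests on assumptions the thread has not confirmed: both files are taken to be `|`-delimited with the pin in the first field, and the function names `collect_pins` and `build_lookup` are made up for illustration.

```python
def collect_pins(source_lines, sep="|", pin_index=0):
    """Pass 1: gather the set of pins that actually occur in the source file."""
    pins = set()
    for line in source_lines:
        line = line.rstrip("\n")
        if line:  # skip blank lines
            pins.add(line.split(sep)[pin_index])
    return pins


def build_lookup(lookup_lines, wanted_pins, sep="|", pin_index=0):
    """Pass 2: hash only the lookup records whose pin is wanted."""
    lookup = {}
    for line in lookup_lines:
        fields = line.rstrip("\n").split(sep)
        if fields[pin_index] in wanted_pins:
            lookup[fields[pin_index]] = fields
    return lookup
```

Both functions accept any iterable of lines, so open file handles can be passed in directly and neither file has to be read into memory in full. Only the set of pins and the matching lookup records are kept.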
eff_i_g,
I am afraid that, after reading your reply to BrowserUk, I am still left wondering how the program works. You speak as though we already understand what you are talking about. What do you mean by section and how is it determined? This isn't really a question I want you to answer because I am sure it will just lead to more questions.
I am afraid you just aren't providing the technical details necessary to help. I believe the only way that I personally am going to be able to help is if you were to provide a sample of the data (masking sensitive info is fine but it must be representative of the real data), the code that is processing it, and an example of how it is invoked.