in reply to Re^2: modification of the script to consume less memory with higher speed
in thread modification of the script to consume less memory with higher speed
You appear to keep the first record that is seen, in full, while subsequent matching records are only tallied by their count. Is that right?
Now, the remaining question is: do you want the output records to keep the order in which they were processed, or is it acceptable if they appear in arbitrary order?
If any output order will do, then the simplest way to process your job is to divide it into parts. For example, you can dump the records into temporary files according to the first few letters of the key. Let's say the intermediate files are TACA.tmp, CATT.tmp, AGAT.tmp, etc. All records with the same key end up in the same file, so each temp file can be processed on its own, and only the keys of one file have to be held in memory at a time. After that, process each temp file individually, appending the output to the final result. Questions?
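
To make the idea concrete, here is a minimal sketch in Python (not the original poster's Perl script). It assumes things the thread does not state: that each record is one line, that the key is the first tab-separated field, and that keys are made of characters that are safe in file names (e.g. DNA letters). The names `split_into_buckets`, `reduce_bucket`, `dedupe`, and `prefix_len` are hypothetical, chosen only for illustration.

```python
import os
import tempfile


def split_into_buckets(lines, tmp_dir, prefix_len=4):
    """Pass 1: append every record to <tmp_dir>/<key prefix>.tmp."""
    handles = {}  # bucket name -> open file handle
    try:
        for line in lines:
            line = line.rstrip("\n")
            if not line:
                continue
            # Assumption: the key is the first tab-separated field.
            key = line.split("\t", 1)[0]
            bucket = key[:prefix_len]
            if bucket not in handles:
                path = os.path.join(tmp_dir, bucket + ".tmp")
                handles[bucket] = open(path, "a")
            handles[bucket].write(line + "\n")
    finally:
        for fh in handles.values():
            fh.close()
    return sorted(handles)


def reduce_bucket(path):
    """Pass 2: keep the first record per key in full, count the rest."""
    seen = {}  # key -> [first record, count]
    with open(path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            key = line.split("\t", 1)[0]
            if key in seen:
                seen[key][1] += 1
            else:
                seen[key] = [line, 1]
    return seen


def dedupe(lines, out_path, prefix_len=4):
    """Split into temp files, then reduce each one and append to the result."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        buckets = split_into_buckets(lines, tmp_dir, prefix_len)
        with open(out_path, "w") as out:
            for bucket in buckets:
                path = os.path.join(tmp_dir, bucket + ".tmp")
                for record, count in reduce_bucket(path).values():
                    out.write(f"{record}\t{count}\n")
```

Calling `dedupe(open("input.txt"), "result.txt")` would stream the input once, then handle one temp file at a time. The sketch keeps one file handle open per prefix, which is fine for a few hundred prefixes (4 letters over A/C/G/T gives at most 256) but would need a different approach with a much longer prefix. Output order follows the sorted bucket names, not the input order.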
Re^4: modification of the script to consume less memory with higher speed
by Anonymous Monk on Jul 30, 2016 at 06:22 UTC
by Anonymous Monk on Jul 30, 2016 at 07:46 UTC