in reply to Re: Memory utilization and hashes
in thread Memory utilization and hashes
It turns out that a Unix sort was exactly the preliminary step that was missing to speed this up. With the right choice of sort keys, the file is now in sequential order by "ID". When a new query comes in, it is easy to check whether the current "ID" differs from the prior "ID"; if it does, I flush the accumulated hash entries and continue. In testing, this keeps the hash to no more than 3-7 'extra' keys for each set of "ID"s in the file before the set is dumped.
Memory usage has stayed small, and processing now takes approximately 1/4 of the total time of the prior runs.
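For anyone reading later, here is a rough sketch of the flush-on-ID-change idea. It is written in Python, with a dict standing in for the hash, and is not my actual script. The record layout (tab-separated ID, key, value) and the name `process_sorted` are assumptions made up for the illustration. The input is assumed to be pre-sorted on the first field, for example with something like `sort -t$'\t' -k1,1` on a hypothetical input file.

```python
def process_sorted(lines, sep="\t"):
    """Yield (id, entries) pairs from lines already sorted by ID.

    Assumed layout (hypothetical): ID <sep> key <sep> value.
    Only the entries for the current ID are held in memory.
    """
    current_id = None
    bucket = {}  # accumulated entries for the current ID only
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        rec_id, key, value = line.split(sep, 2)
        # The ID changed: flush what was accumulated and start a new set.
        if current_id is not None and rec_id != current_id:
            yield current_id, bucket
            bucket = {}
        current_id = rec_id
        bucket[key] = value
    # Flush the final set once the input is exhausted.
    if current_id is not None:
        yield current_id, bucket
```

Because the dict is replaced on every ID change, its size is bounded by the largest single group rather than by the whole file, which would be consistent with the small memory footprint described above. Note that this only works if the sort really does bring all lines for an ID together.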
Re^3: Memory utilization and hashes
by poj (Abbot) on Jan 19, 2018 at 13:34 UTC
by bfdi533 (Friar) on Jan 25, 2018 at 20:30 UTC

Re^3: Memory utilization and hashes
by bfdi533 (Friar) on Jan 18, 2018 at 23:46 UTC