in reply to Re^2: how to merge many files of sorted hashes?
in thread how to merge many files of sorted hashes?
But it doesn't matter. For every single key you are doing a linear search through every file. So that's (number of keys) * 100,000,000 matrices to wade through; no wonder this is slow.
You need to be writing each %small_hash in sorted order. Sorting 1000 small_hashes will be faster than sorting the big hash (1000*(n/1000)*log(n/1000) = n log(n/1000) < n log(n) but the really important thing is this will enable the merge I was talking about before, which only reads each file once rather than once for every key value.
|
|---|