in reply to Alternatives to DB for comparable lists
If I understand correctly, you wish to obtain the following for each file:
Where the files are distributed over six servers.
This probably depends upon how you are planning to collect all the data, but my personal approach would be to have a small script running on each of the six servers performing the hashing and sending each result back to a common collector. This assumes network connectivity.
I think it would be relatively easy to calculate the tuple of the five items for each server with a script and issue them over the network back to a central collection script. Each server can be hashing and issuing results simultaneously to the same collector.
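As a rough illustration of the per-server script, here is a minimal sketch in Python (a stand-in; the same shape would work in Perl). The five items are not spelled out in this post, so the fields used here (server name, path, size, modification time, digest) are assumptions, as are the function names, the JSON-lines wire format and the collector's host and port.

```python
import hashlib
import json
import os
import socket


def file_record(server, path, chunk_size=1 << 20):
    """Build one result tuple for a file; the field choice is illustrative."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large files never have to fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    st = os.stat(path)
    return (server, os.path.abspath(path), st.st_size,
            int(st.st_mtime), digest.hexdigest())


def encode_record(record):
    """Serialise a record as one newline-terminated JSON line (assumed format)."""
    return (json.dumps(record) + "\n").encode("utf-8")


def send_records(records, host, port):
    """Stream records to the collector over a single TCP connection."""
    with socket.create_connection((host, port)) as conn:
        for record in records:
            conn.sendall(encode_record(record))
```

Each server would call `file_record` for every file it walks and pass the results to `send_records`; the collector only has to read lines and decode them.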
While there may be a lot of data to hash, the actual results are going to be small. Therefore, as you know exactly what you are obtaining (the five items of data), I would just go the easiest route and throw them in a table in an SQLite database via DBD::SQLite.
Then, once you have all the data in your DB, you can perform offline analysis as much as you want, relatively cheaply.
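To give an idea of how little is needed on the collector side, here is a sketch using Python's built-in `sqlite3` module in place of DBD::SQLite. The table name, column names and the example query are assumptions for illustration, not the actual five items from the original question.

```python
import sqlite3


def open_store(db_path=":memory:"):
    """Open (or create) the results database with one table of file records."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS files (
               server TEXT    NOT NULL,
               path   TEXT    NOT NULL,
               size   INTEGER NOT NULL,
               mtime  INTEGER NOT NULL,
               digest TEXT    NOT NULL,
               PRIMARY KEY (server, path)
           )"""
    )
    # Index the digest so comparisons across servers stay cheap.
    conn.execute("CREATE INDEX IF NOT EXISTS files_digest ON files (digest)")
    return conn


def store_records(conn, records):
    """Insert a batch of 5-tuples in one transaction; reruns overwrite old rows."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO files VALUES (?, ?, ?, ?, ?)", records
        )


def digests_on_multiple_servers(conn):
    """Return (digest, server_count) for content present on more than one server."""
    return conn.execute(
        """SELECT digest, COUNT(DISTINCT server) AS servers
           FROM files
           GROUP BY digest
           HAVING COUNT(DISTINCT server) > 1
           ORDER BY digest"""
    ).fetchall()
```

Once the rows are in, any further comparison is just another query against the same table, with no need to re-hash anything.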
As a side note, I'd probably go with SHA-256 rather than MD5: MD5 is known to be vulnerable to deliberately constructed collisions, and SHA-256 is not that much more computationally expensive.
Re^2: Alternatives to DB for comparable lists
by cavac (Prior) on May 16, 2018 at 11:59 UTC |