in reply to Alternatives to DB for comparable lists
One approach might be:
Perhaps sending the batch-lines to STDOUT is the easiest approach where the tool could even be invoked by an ssh-command issued on the collection host? That also eliminates the requirement for DB drivers on the host to be scanned.
Use a header/trailer or checksum to assert completeness/integrity of the chunk of lines transmitted and perhaps also add some interesting meta-data (creation time, IP, etc.).
Update:
Oh, you asked for DB-alternatives... Rough estimation: 750k entries with a mean entry size of ca. 500 bytes results in a total size of approx. 375 MB. My experiment with Storable resulted in a file of size 415 MB. Reading/writing took ca. 2.0/3.5s on a moderate PC (3GHz, SSD).
Merging and storing all data into a native Perl data structure and using Storable for persistence looks feasible. PRO: fast speed for analytics; CON: no luxury that comes with a DB.
|
|---|