in reply to How to save memory, parsing a big file.
The first thing you can do is try to rewrite your algorithm so that it processes the input as sequentially as possible. I can't help much here because I don't fully understand what you want as the final result of the processing, but if you tell us more about it...
You can also store the data in the hash in packed form using vec, for both the keys and the values. The key would take 6 bytes (4 for the IP address, 2 for the port) and the value 10 bytes, plus the scalar (SV) overhead. You don't need to store $source in both the key and the value. That would probably reduce your memory requirements to about 1/10.
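To make the packed-key idea concrete, here is a minimal sketch. It uses pack and Socket's inet_aton rather than vec, simply because that is shorter to write; the helper names pack_key and unpack_key and the value layout (target IP plus a 32-bit counter) are my own assumptions, not something taken from your code.

```perl
use strict;
use warnings;
use Socket qw(inet_aton inet_ntoa);

# Pack a dotted-quad IP (4 bytes) and a port (2 bytes, big-endian)
# into a fixed-width 6-byte hash key. Helper names are hypothetical.
sub pack_key {
    my ($source, $dport) = @_;
    return pack 'a4 n', inet_aton($source), $dport;
}

# Reverse of pack_key: recover the dotted-quad IP and the port.
sub unpack_key {
    my ($key) = @_;
    my ($ip, $dport) = unpack 'a4 n', $key;
    return (inet_ntoa($ip), $dport);
}

my %total;
my $key = pack_key('10.0.1.10', 1445);

# Assumed value layout for illustration only: target IP + a counter.
$total{$key} = pack 'a4 N', inet_aton('192.0.2.7'), 3;

printf "key length: %d bytes\n", length $key;
my ($source, $dport) = unpack_key($key);
print "$source:$dport\n";
```

The fixed-width key also removes any ambiguity about where the IP ends and the port begins.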
Regarding the code you have posted, building the keys as "$source$dport" is ambiguous. For instance, 10.0.1.1445 could be 10.0.1.1 and 445, 10.0.1.14 and 45, or 10.0.1.144 and 5. Better to use $total{$source,$dport}, which is equivalent to $total{join($;, $source, $dport)}, where $; defaults to "\034".
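A small sketch of the collision, using made-up sample pairs (the hash names %naive and %safe are mine):

```perl
use strict;
use warnings;

my (%naive, %safe);

# Three different (source, port) pairs that concatenate to the same string.
for my $pair (['10.0.1.1', 445], ['10.0.1.14', 45], ['10.0.1.144', 5]) {
    my ($source, $dport) = @$pair;
    $naive{"$source$dport"}++;    # all three become "10.0.1.1445"
    $safe{$source, $dport}++;     # joined with $; so they stay distinct
}

printf "naive keys: %d, safe keys: %d\n",
    scalar(keys %naive), scalar(keys %safe);

# The parts can be recovered by splitting on the subscript separator.
my $sep = quotemeta $;;
for my $key (sort keys %safe) {
    my ($source, $dport) = split /$sep/, $key;
    print "$source port $dport\n";
}
```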
Finally, you are using only the source IP and the destination port as keys for %total, and storing the target IP as the value. That doesn't make sense to me because, in general, there could be several different target IPs for the same key. For instance, a user (source IP) browsing the web (port 80) would be accessing several servers (different target IPs).
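If you do need the targets, one possible way to keep them all is a nested hash per (source, port) key. This is only a sketch with invented sample records, since I don't know what you want to report in the end:

```perl
use strict;
use warnings;

# Made-up records: [source IP, target IP, destination port].
my @records = (
    ['10.0.1.10', '192.0.2.1',    80],
    ['10.0.1.10', '198.51.100.2', 80],
    ['10.0.1.10', '192.0.2.1',    80],
);

my %total;
for my $rec (@records) {
    my ($source, $target, $dport) = @$rec;
    # Count hits per target instead of overwriting a single target IP.
    $total{$source, $dport}{$target}++;
}

my $targets = $total{'10.0.1.10', 80};
print join(', ', map { "$_ ($targets->{$_})" } sort keys %$targets), "\n";
```

Note that this costs more memory than storing a single value, so it only pays off if the per-target breakdown is actually needed.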