Memory efficient way to deal with really large arrays?
by sectokia (Scribe) on Dec 12, 2020 at 23:42 UTC
sectokia has asked for the wisdom of the Perl Monks concerning the following question:
My input is 500M pairs of 32-bit integers, and no integer occurs more than 256 times.
I want to load the pairs into memory, and also keep an array recording how many times each integer appears across all pairs.
In C, I only need 5GB of RAM to do this. For the pairs: 500M * 4 bytes (32-bit int) * 2 (ints per pair) = 4GB. For the occurrences: 1G * 1 byte (8-bit int) = 1GB.
However, when I do the same thing in Perl, the RAM usage is more like 256 bytes per pair:
I am seeing about 1GB of RAM usage per 4M input pairs, so I would need 125GB of RAM!
Is there any way to tell Perl that a scalar should be an integer of a fixed size only?
The other idea I have is that I should not be using an array at all, but rather one gigantic scalar, and then pack/unpack the values into and out of that scalar.
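To make the "gigantic scalar" idea concrete, here is a minimal sketch that uses Perl's built-in vec to treat one string as an array of 32-bit integers and a second string as an array of 8-bit counters. The sizes, the $max_id bound, and the helper names set_pair and get_pair are made up for illustration, and this is only one possible way to do it, not a tested solution for the full 500M pairs.

```perl
use strict;
use warnings;

# Hypothetical small sizes for illustration only.
my $n_pairs = 1000;    # the real input would be 500M pairs
my $max_id  = 5000;    # assumed upper bound on the integer values

# One string holds every pair: 2 ints * 4 bytes = 8 bytes per pair.
my $pairs = "\0" x (8 * $n_pairs);
# One byte per possible integer value for the occurrence counts.
my $counts = "\0" x ($max_id + 1);

# Store pair number $i and bump the counters of both integers.
# Note: an 8-bit counter only holds 0..255; larger values get truncated.
sub set_pair {
    my ($i, $x, $y) = @_;
    vec($pairs, 2 * $i,     32) = $x;
    vec($pairs, 2 * $i + 1, 32) = $y;
    vec($counts, $x, 8) = vec($counts, $x, 8) + 1;
    vec($counts, $y, 8) = vec($counts, $y, 8) + 1;
    return;
}

# Read pair number $i back as a two-element list.
sub get_pair {
    my ($i) = @_;
    return (vec($pairs, 2 * $i, 32), vec($pairs, 2 * $i + 1, 32));
}

set_pair(0, 7,  42);
set_pair(1, 42, 4999);

my ($first, $second) = get_pair(1);
print "pair 1: $first $second\n";
print "count of 42: ", vec($counts, 42, 8), "\n";
print "bytes used for pairs: ", length($pairs), "\n";
```

With this layout the storage cost is the length of the two strings (8 bytes per pair plus 1 byte per counter) rather than one full Perl scalar per element. At the real scale the pairs string would be about 4GB, which I assume needs a 64-bit perl build.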
What would be the best way to approach this in Perl?