in reply to Compressing a set of integers
May I point you to the fact that there is an option in pack/unpack to store numbers with built in compression? They're called "BER-compressed integers" (see perlfunc:pack — use the "w" template), and they use a few bytes as possible. They're slightly less efficient in storage on the maximum number, as only 7 bits per byte are significant. But on the plus side: they don't waste any bytes they don't need, so for not so extreme values, the waste is no more than with the fixed length representation, and often less.
Experimentally I've determined the following ranges for the byte count:
range | byte count |
---|---|
0 - 127 | 1 |
128 - 16383 | 2 |
16384 - 2097151 | 3 |
2097152 - 268435455 | 4 |
268435456 - 4294967295 (?) | 5 |
That problem aside, here's an extra way you can save on space: sort the numbers in ascending order, and store the difference between successing numbers — the first value being the smallest number itself, the difference from zero. The more numbers you have, the closer they'll be together, and that way, you might shave of a few bytes.
|
---|