
LanX:

You're not missing anything that I know of. My "not quite" phrasing was based on the idea of encoding each group (@c) as a single character, which would use 9 characters (72 bits). Had I thought of simply packing the required 51-bit records together, that would have been more than sufficient to get 50% compression, as the file would take 635 bytes to encode 100 records (sans newlines).

(The 51 bits came from: 50 different possibilities for each group of 10 in the inner loop (log2(50) == 5.64.. bits/group) * (9 groups) == 50.79 bits.)
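
The arithmetic can be checked with a short sketch. It is in Python purely for illustration, and the constant names are mine, not from the original code. It also shows that the 635-byte figure corresponds to the unrounded 50.79 bits per record (all 900 groups treated as one big base-50 number), while padding every record to a whole 51 bits comes out slightly higher:

```python
import math

GROUPS = 9       # groups per record, as stated above
CHOICES = 50     # possibilities per group, as stated above
RECORDS = 100    # records in the file

# Information content of one record, in bits (about 50.79).
bits_exact = GROUPS * math.log2(CHOICES)

# Whole bits needed if each record is stored on its own.
bits_per_record = math.ceil(bits_exact)

# 100 records of 51 bits each, rounded up to whole bytes.
bytes_padded = math.ceil(RECORDS * bits_per_record / 8)

# All 900 groups encoded densely as a single base-50 number.
bytes_dense = math.ceil(RECORDS * bits_exact / 8)

print(round(bits_exact, 2), bits_per_record, bytes_padded, bytes_dense)
```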

...roboticus

When your only tool is a hammer, all problems look like your thumb.


Re^4: Data compression by 50% + : is it possible?
by LanX (Saint) on May 12, 2019 at 12:32 UTC
    >  using a single character to encode each group (@c) into a character, so it would use 9 characters (72) bits

    Oh I see, but I hope you are aware that your approach can easily be packed into 9*6=54 bits and is easier to code than mine.

    With a per line ratio of 7 bytes = 56 bits you'll already have a 50%+ compression.
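
A minimal sketch of the 6-bits-per-group packing described above, assuming each record is a list of 9 values in the range 0..49 (the function names `pack6` and `unpack6` are hypothetical, and Python is used only for illustration):

```python
def pack6(groups):
    """Pack 9 values in 0..49 into 7 bytes, using 6 bits per value."""
    if len(groups) != 9 or any(not 0 <= g < 50 for g in groups):
        raise ValueError("expected 9 values in 0..49")
    n = 0
    for g in groups:
        n = (n << 6) | g           # append the next 6-bit field
    return n.to_bytes(7, "big")    # 54 bits used, top 2 bits stay zero


def unpack6(blob):
    """Inverse of pack6: recover the 9 values from 7 bytes."""
    n = int.from_bytes(blob, "big")
    groups = []
    for _ in range(9):
        groups.append(n & 0x3F)    # lowest 6 bits hold the last field
        n >>= 6
    return groups[::-1]            # fields came out last-first


record = [0, 49, 7, 13, 25, 31, 42, 1, 48]
packed = pack6(record)
print(len(packed), unpack6(packed) == record)
```

Only shifts and masks are involved, which is why this variant is simple to code on any word size.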

    My approach would require modulo-50 calculations on 51-bit integers; I'm not sure how tricky that is on a 32-bit machine.
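
For comparison, a sketch of the base-50 variant, again assuming 9 values in 0..49 per record (the names `encode50` and `decode50` are hypothetical). Python integers are arbitrary precision, so this sidesteps the 32-bit question entirely; on a platform limited to 32-bit integers the same steps would need a big-integer library or manual multi-word arithmetic:

```python
def encode50(groups):
    """Treat 9 values in 0..49 as the digits of one base-50 number."""
    if len(groups) != 9 or any(not 0 <= g < 50 for g in groups):
        raise ValueError("expected 9 values in 0..49")
    n = 0
    for g in groups:
        n = n * 50 + g             # shift left by one base-50 digit
    return n                       # always below 50**9, so it fits in 51 bits


def decode50(n):
    """Inverse of encode50, using repeated division and modulo by 50."""
    groups = []
    for _ in range(9):
        n, g = divmod(n, 50)       # peel off the lowest base-50 digit
        groups.append(g)
    return groups[::-1]


record = [0, 49, 7, 13, 25, 31, 42, 1, 48]
n = encode50(record)
print(n.bit_length() <= 51, decode50(n) == record)
```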

    So I'd rather "waste" 5 bits/line for a pragmatic solution.

    Btw: I'm reluctant to try implementing Huffman here, because the OP could probably change the parameters in his next post.

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery
    FootballPerl is like chess, only without the dice