in reply to Re^2: How to efficently pack a string of 63 characters
in thread How to efficently pack a string of 63 characters

I think you might be misreading the glob.

The 5 in 5,2,1 maps every run of 5 letters to one byte (look at the mapping hash with Data::Dump), so I'm getting a 5-to-1 reduction. (That ignores the 2 and the 1, which are only there to handle the leftover letters in strings whose length is not a multiple of 5.)
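For anyone following along, here is a minimal sketch of that kind of mapping. It is not the code from the earlier post: the three-letter alphabet A, B, C and the names `%encode`, `%decode`, `pack_str` and `unpack_str` are assumptions made for illustration.

```perl
use strict;
use warnings;

# ASSUMPTION: a three-letter alphabet; substitute the real letters.
my @alphabet = qw(A B C);
my $set = '{' . join(',', @alphabet) . '}';

# glob expands the brace pattern into every combination, so lengths
# 5, 2 and 1 give 243 + 9 + 3 = 255 chunks, which fits in one byte.
my @chunks = map { glob($set x $_) } 5, 2, 1;

my (%encode, %decode);
my $code = 0;
for my $chunk (@chunks) {
    my $byte = chr $code++;
    $encode{$chunk} = $byte;
    $decode{$byte}  = $chunk;
}

# Hypothetical helper: the regex takes 5 letters while it can,
# then falls back to 2 and 1 for the tail.
sub pack_str {
    my ($str) = @_;
    return join '', map { $encode{$_} } $str =~ /(.{5}|.{2}|.)/g;
}

# Hypothetical helper: every byte decodes back to its chunk.
sub unpack_str {
    my ($packed) = @_;
    return join '', map { $decode{$_} } split //, $packed;
}
```

Dumping `%encode` with Data::Dump shows the 5-letter keys each mapping to a single byte; only the last one or two bytes of a packed string come from the 2- and 1-letter keys.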

I don't understand your use of 'chunks', or where you get 9 of them.

BTW: It's a 5-to-1 reduction regardless of any redundancy in the string. Other compressors may exploit redundancy to do better. It comes down to how random the letters really are.

Re^4: How to efficently pack a string of 63 characters
by LanX (Saint) on Sep 10, 2021 at 07:45 UTC
    Yes, I misread the glob as producing 9 run-length chunks 'AAAAA','AA','A',... and coding each as a byte.

    I agree that this 5-to-1 reduction is almost optimal if the letters are really random, i.e. without redundancy.

    3**5 = 243, which means you are using 7.92 bits of each byte (log2(243) ≈ 7.92), plus a few more codes for the smaller trailing chunks.

    That's very efficient. The theoretical optimum is about 37.5 bytes, and you need only 39.
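The arithmetic behind those figures can be checked with a short sketch. The input length of 189 letters is an assumption, picked only because it reproduces the 39 and roughly 37.5 bytes quoted above; the helper names `optimum_bytes` and `packed_bytes` are likewise made up for illustration.

```perl
use strict;
use warnings;

# Entropy bound in bytes for $n letters drawn uniformly from $k symbols.
sub optimum_bytes {
    my ($n, $k) = @_;
    return $n * log($k) / log(2) / 8;
}

# Bytes used by the 5,2,1 scheme: one per full group of 5, then the
# 0..4 leftover letters are covered by 2s and at most one 1.
sub packed_bytes {
    my ($n) = @_;
    my $rest = $n % 5;
    return int($n / 5) + int($rest / 2) + $rest % 2;
}

my $n = 189;    # ASSUMED input length, substitute the real one

printf "bits used per byte: %.2f\n", log(3**5) / log(2);     # 7.92
printf "optimum: %.2f bytes\n",      optimum_bytes($n, 3);   # 37.44
printf "5,2,1 packing: %d bytes\n",  packed_bytes($n);       # 39
```

Under that assumption the scheme lands within about 1.6 bytes of the bound for uniformly random letters.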

    But I think zip should do considerably better than 40% if this particular raw input were longer. (update: as shown here)

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery