
Sorry, I posted that at about 5 am my time, before I had had any Red Bull =) Here is what I meant to say: if (0,28,48,54,60,62,76,126) = (0000,1000,0100,1100,0010,1010,1110,0001), then you can encode the original array as:
[0000,1100,1000,1110,0001,0000,1000,1100,1010,1100,0000,1000,1100,0010,0100,0000,1100,1100,1010,1100,0000]
or as bits:
000011001000111000010000100011001010110000001000110000100100000011001100101011000000
Then convert that to 11 bytes: 84 bits of data plus a final 1111 in the last 4 bits to tag the padding. To get the data back, chomp 4 bits per array slot until you run out of bits or hit 1111.
For comparison, the original array written out as text,
0,54,28,76,126,0,28,54,62,54,0,28,54,60,48,0,54,54,62,54,0
stored in a file is 58 bytes. As long as the set of unique numbers stays fixed and has no more than 15 members (1111 is reserved for the padding tag), you save space by storing it this way, and the larger the array, the more savings you will see.
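A minimal sketch of that scheme with pack/unpack, in case it helps to see it run. The helper names encode and decode are made up for illustration, and the code assumes every input number already has an entry in the lookup table. The B* template packs a string of 0/1 characters into bytes, high bit first.

```perl
use strict;
use warnings;

# Lookup table from the post: each distinct value gets a 4-bit code.
# 1111 is left unused so it can tag the padding nibble.
my @values = (0, 28, 48, 54, 60, 62, 76, 126);
my @codes  = qw(0000 1000 0100 1100 0010 1010 1110 0001);

my (%code_for, %value_for);
@code_for{@values} = @codes;
@value_for{@codes} = @values;

# Hypothetical helper: numbers in, packed byte string out.
sub encode {
    my @data = @_;
    my $bits = '';
    for my $n (@data) {
        die "no code for $n\n" unless exists $code_for{$n};
        $bits .= $code_for{$n};
    }
    $bits .= '1111' if length($bits) % 8;    # fill the last half byte
    return pack 'B*', $bits;
}

# Hypothetical helper: packed byte string in, numbers out.
sub decode {
    my ($packed) = @_;
    my $bits = unpack 'B*', $packed;
    my @out;
    for my $nibble ($bits =~ /(.{4})/g) {
        last if $nibble eq '1111';           # padding tag, stop here
        push @out, $value_for{$nibble};
    }
    return @out;
}

my @data   = (0,54,28,76,126,0,28,54,62,54,0,28,54,60,48,0,54,54,62,54,0);
my $packed = encode(@data);
my @back   = decode($packed);

printf "text: %d bytes, packed: %d bytes\n",
    length(join ',', @data), length($packed);
print "round trip ", ("@back" eq "@data" ? "ok" : "FAILED"), "\n";
```

For the 21-element array above this reports 58 bytes as text and 11 bytes packed. The lookup table itself would also have to be stored or agreed on somewhere, which the sketch does not count.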

It was really silly anyway, because you will note that for such a small set, the Perl code to compress/decompress plus the compressed data is larger than the starting data =)

-Waswas

Re: Re: Re: pack unpack charcount repetition
by denthijs (Acolyte) on Jun 10, 2003 at 09:23 UTC
    my dataset really was chosen awfully, it seems...
    still, this could be very handy if in the future I weren't encoding yaph(248131) but the full Yet Another Perl Hacker.
    or countless other things, i'm sure ;)
    *less is more* :)