in reply to having fun with RE - was: Re: One Zero variants_without_repetition
in thread One Zero variants_without_repetition

You couldn't have been more understanding :)
That's the shortest code, and it hits the nail right on the head.
Though I feel extremely sad to agree with ohcamacj and admit my failure: I won't live long enough to see it finish with that many ones and zeroes.

Originally I was trying to make an encoder/decoder that reads bytes from a file and, for every block of 26 bytes (at least), counts how many ones and zeroes there are and computes the MD5 of the original binary string (26*8 bits). It then writes one line per 26-byte block to a new file in the format "$ones,$zeroes,MD5x16\n".
Then, to decode the new file, it reads each line, tries every possible string containing those numbers of 1's and 0's, and compares each candidate's MD5 with the one it read. When one matches, it writes the original file back out by printing pack('B8', $every_8_ones_or_zeroes_after_split).
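A minimal Python sketch of the encoding step described above, not the original Perl. It assumes "MD5x16" means the full 16-byte MD5 digest, written here as hex, and that the digest is taken over the raw bytes of the block; `encode_block` and `encode` are names made up for illustration.

```python
import hashlib

BLOCK_SIZE = 26  # bytes per block, as in the post


def encode_block(block: bytes) -> str:
    """Return one "ones,zeroes,md5hex" record for a block of bytes."""
    ones = sum(bin(byte).count("1") for byte in block)
    zeroes = 8 * len(block) - ones
    # Assumption: the MD5 is computed over the raw bytes of the block.
    return f"{ones},{zeroes},{hashlib.md5(block).hexdigest()}"


def encode(data: bytes) -> list[str]:
    """Split data into 26-byte blocks and encode each one."""
    return [encode_block(data[i:i + BLOCK_SIZE])
            for i in range(0, len(data), BLOCK_SIZE)]


record = encode_block(b"\x00" * BLOCK_SIZE)
print(record)
print(len(record), "characters for", BLOCK_SIZE, "input bytes")
```

Under those assumptions each record is longer than the 26 bytes it stands for, so the digest would have to be shortened before the output could shrink at all.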
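The brute-force decoding idea can be sketched in Python on a block small enough to finish. This is an illustration under assumptions, not the original code: `decode_block` is a hypothetical name, the full MD5 hex digest is stored, and the block here is only 2 bytes long.

```python
import hashlib
from itertools import combinations


def decode_block(ones: int, zeroes: int, md5_hex: str):
    """Try every bit string with the given counts; return the first MD5 match."""
    n_bits = ones + zeroes  # assumed to be a multiple of 8
    for positions in combinations(range(n_bits), ones):
        value = 0
        for p in positions:
            value |= 1 << (n_bits - 1 - p)  # position 0 is the leftmost bit
        candidate = value.to_bytes(n_bits // 8, "big")
        if hashlib.md5(candidate).hexdigest() == md5_hex:
            return candidate
    return None  # no candidate matched


original = b"Hi"  # only 16 bits, so the search finishes quickly
ones = sum(bin(b).count("1") for b in original)
zeroes = 8 * len(original) - ones
recovered = decode_block(ones, zeroes, hashlib.md5(original).hexdigest())
print(recovered)
```

For these 16 bits the loop checks at most 8008 candidates. The same loop over a full 26-byte block is what makes the approach impractical.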

But now I understand I'd wait forever to decode even a few bytes.
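To put a rough number on "forever", here is a small Python calculation of how many candidates a balanced 26-byte block (104 ones, 104 zeroes) would have. The rate of one billion MD5 checks per second is an assumed figure chosen only for illustration.

```python
import math

n_bits = 26 * 8
# A balanced block (104 ones, 104 zeroes) has the most candidate strings.
worst = math.comb(n_bits, n_bits // 2)
rate = 10**9  # assumed: one billion MD5 checks per second
seconds_per_year = 365 * 24 * 3600
years = worst / rate / seconds_per_year

print(f"candidates: {worst:.3e}")
print(f"years at 1e9 checks/s: {years:.3e}")
```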

P.S.: The beauty of such a compression is, first, that it's a sort of logical interpretation of almost random strings (of 1/0), and second, that I could compress the compressed file again and again until it reaches its minimal length (<= 26).

Re^2: having fun with RE - was: Re: One Zero variants_without_repetition
by oha (Friar) on Aug 09, 2007 at 09:50 UTC
    first, my regex is not perfect; there are ways to make it faster (making it greedy and anchoring it at the start of the string is a good start). but anyway it's slow.

    regarding what you are going to do: first, you want to use a 16-bit MD5 and the count of ones and zeros. the worst case is having all 26 ones or zeroes, so you need 5 bits for that information: that means for 26 bits of data you'll get a 5+16 = 21 bit result. that's about 20% compression.
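The arithmetic in that paragraph can be checked with a few lines of Python. The variable names are made up for illustration, and the 16-bit hash is the truncated MD5 assumed in the reply.

```python
import math

data_bits = 26
# The number of ones ranges over 0..26, which is 27 distinct values.
count_bits = math.ceil(math.log2(data_bits + 1))
hash_bits = 16  # truncated MD5, as assumed in the reply
encoded_bits = count_bits + hash_bits
saving = 1 - encoded_bits / data_bits

print(count_bits, encoded_bits, f"{saving:.1%}")  # 5 21 19.2%
```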

    unfortunately, you can't guarantee that for a given MD5 and number of ones you'll have only 1 possible 26-bit data string. you could analyze it and find out how many cases you can have at worst, and i fear it's more than 32 (if it were 32, you would need another 5 bits, and the total would be 26 bits again).
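A counting (pigeonhole) argument gives a lower bound on that worst case without running MD5 at all. The Python sketch below assumes the same 26-bit block and 16-bit hash as the reply; it bounds the worst case from below and does not measure the real distribution of MD5 values.

```python
import math

data_bits, hash_bits = 26, 16
# Strings with exactly 13 ones form the largest class.
largest_class = math.comb(data_bits, data_bits // 2)
# Pigeonhole: some 16-bit hash value is shared by at least this many of them.
min_worst_case = math.ceil(largest_class / 2**hash_bits)
# Extra bits needed to say which of those candidates was the original.
extra_bits = math.ceil(math.log2(min_worst_case))

print(largest_class, min_worst_case, extra_bits)  # 10400600 159 8
```

So for a block with 13 ones, at least 159 strings must share some hash value, and picking among them takes at least 8 more bits: 21 + 8 = 29 bits, which is more than the 26 bits of input.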