A 50% compression rate is possible, and so is more. It depends on how the input data is arranged and on the algorithm you're using!

Here's an idea: take the first (lowest) bit of each number and make a list out of those bits. See if you can compress that list at a better rate. Then take the second bit of each input number and make another list, and so on. If your numbers go even-odd-even-odd, or are all odd or all even, then this method will help, because the list of lowest bits turns into a very regular pattern.
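Here's a minimal sketch of that bit-splitting idea in Python. The function names are made up for illustration, and I'm assuming "first bit" means the least significant bit and that every number fits in `width` bits:

```python
def bit_planes(numbers, width):
    """Return `width` lists; plane k holds bit k (0 = least significant) of every number."""
    return [[(n >> k) & 1 for n in numbers] for k in range(width)]


def from_bit_planes(planes):
    """Inverse of bit_planes: rebuild the original numbers from the planes."""
    count = len(planes[0]) if planes else 0
    return [sum(planes[k][i] << k for k in range(len(planes))) for i in range(count)]


numbers = [2, 4, 6, 8, 10, 12]          # all even, so the lowest-bit plane is all zeros
planes = bit_planes(numbers, width=4)   # assumption: every number fits in 4 bits
print(planes[0])                        # the lowest bit of each number
print(from_bit_planes(planes) == numbers)
```

Each plane can then be handed to whatever compressor you like. The split is lossless, so you can always put the numbers back together.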

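If the input numbers are totally random, I would still not give up just yet! I'd generate a list of "random" numbers and XOR the input numbers with them to get a new list that might compress better. There's no guarantee, so you have to measure it. Keep the seed you used, because XORing with the same list again is how you get the original numbers back. Try that!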
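A rough way to try this out, sketched in Python. Everything here is an assumption for illustration: `xor_with_stream` is a made-up helper, Python's `random.Random` stands in for the generator, zlib stands in for the compressor, and the test data is itself just pseudo-random bytes:

```python
import random
import zlib


def xor_with_stream(data: bytes, seed: int) -> bytes:
    """XOR every byte with a seeded pseudo-random byte; calling it twice undoes it."""
    rng = random.Random(seed)
    return bytes(b ^ rng.randrange(256) for b in data)


# Stand-in for "totally random" input data.
data = bytes(random.Random(1).randrange(256) for _ in range(4096))

masked = xor_with_stream(data, seed=12345)
restored = xor_with_stream(masked, seed=12345)   # same seed gives the original back
print(restored == data)

# Compare compressed sizes before and after masking; the result has to be measured.
print(len(zlib.compress(data, 9)), len(zlib.compress(masked, 9)))
```

The sketch only measures the two sizes. Whether any particular seed actually gives you a smaller file is something you find out by running it on your own data.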

Many simple random number generators work this way (it's called a linear congruential generator):

FOR LOOP:
   S = (S * A + B) % C
   print "Random number: ", S
END FOR

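S is the initial seed for the random number generator. Programs often set this from the current time, for example the number of seconds or milliseconds since 1970. A, B, and C are constants. Almost any values will give you some sequence, though real generators pick them carefully so that the sequence doesn't start repeating too soon. In many programming languages the builtin random() function returns a number between 0 and 1. To get that from this formula you can divide S by C, or, if you work with fractional values, set C to 1. If you repeat this calculation over and over again, you get a list of numbers that seems quite random.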
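Here's the loop above as a small runnable Python sketch. The constants are just example values that have been used in some C library rand() implementations, not anything special to this post, and `lcg` is a name I made up:

```python
def lcg(seed, a=1103515245, b=12345, c=2**31):
    """Yield an endless sequence S = (S * A + B) % C, starting from `seed`."""
    s = seed
    while True:
        s = (s * a + b) % c
        yield s


gen = lcg(seed=42)                       # a fixed seed so the run is repeatable
first = [next(gen) for _ in range(5)]
print(first)

# Dividing by C scales each value into the range 0 <= x < 1.
print([s / 2**31 for s in first])
```

With the same seed and constants you always get the same list back, which is what makes the XOR trick reversible.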

By modifying the values of S, A, B, or C even slightly, you get a totally different series of numbers! If, let's say, A is 13.4849927107 and you change just one digit, you will get a list of numbers that does not resemble the previous one at all. So you could initialize these constants and get a random list. Take two random lists and either ADD the values or XOR them or whatever. The resulting list MIGHT HAVE more order than your input data set! And this can help you compress the list further.
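To see how touchy the constants are, and what combining two lists looks like, here's a small Python sketch. It uses integer constants rather than a fractional A, and all names and values are assumptions picked for illustration:

```python
from itertools import islice


def lcg_stream(seed, a, b, c):
    """Endless S = (S * A + B) % C sequence."""
    s = seed
    while True:
        s = (s * a + b) % c
        yield s


def take(seed, a, n=8, b=12345, c=2**31):
    """First n values of the stream for the given seed and multiplier A."""
    return list(islice(lcg_stream(seed, a, b, c), n))


list_one = take(seed=42, a=1103515245)
list_two = take(seed=42, a=1103515246)   # A changed by one

print(list_one)
print(list_two)

# Combine the two lists with XOR; XORing with list_two again recovers list_one.
combined = [x ^ y for x, y in zip(list_one, list_two)]
print([x ^ y for x, y in zip(combined, list_two)] == list_one)
```

Swapping `^` for `+` gives the ADD variant. Either way you need both sets of constants later to undo the combination.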

I've done this with ZIP files... You know, when you compress a ZIP file and then compress it again and again, you reach a limit after which the size starts growing instead of shrinking! But if, at some point, you encode the ZIP file using a list of random numbers, you can sometimes ZIP it again and get an even smaller file! ;-)
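If you want to watch the sizes yourself, here's a small Python sketch that compresses the same data over and over. It uses zlib as a stand-in for ZIP (an assumption on my part, both are DEFLATE-based), and `recompress_sizes` is a made-up helper:

```python
import zlib


def recompress_sizes(data: bytes, passes: int):
    """Compress the data repeatedly and return the size after each pass."""
    sizes = [len(data)]
    for _ in range(passes):
        data = zlib.compress(data, 9)
        sizes.append(len(data))
    return sizes


# Very repetitive sample input, so the first pass has plenty to work with.
text = b"the quick brown fox jumps over the lazy dog\n" * 2000
sizes = recompress_sizes(text, passes=5)
print(sizes)
```

You could apply an XOR mask between passes and compare the size lists with and without it.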


In reply to Re: Data compression by 50% + : is it possible? by harangzsolt33
in thread Data compression by 50% + : is it possible? by baxy77bax
