in reply to Re: Data compression by 50% + : is it possible?
in thread Data compression by 50% + : is it possible?

Many people in this thread are missing crucial information already given in the OP's text (quite apart from not reading the explicit example code the OP posted):

> - order needs not to be preserved

> - occur only once in a given line.

> - They cannot be consecutive (meaning there is no sequence in a dataset).

I.e. tuples like (3,1,...) (same set as (1,3,...)), (1,1,...) (a repeat) or (1,2,...) (consecutive) are impossible and never need to be encoded (see the OP's if condition).
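To get a feel for how much those three constraints shrink the space of possible lines, here is a minimal counting sketch. It is in Python only for brevity, and `n` (largest value) and `k` (numbers per line) are made-up illustration values, not the OP's actual parameters. It relies on the standard result that there are C(n-k+1, k) k-element subsets of 1..n with no two consecutive members, and cross-checks that by brute force.

```python
from itertools import combinations
from math import comb, log2

def count_valid_sets(n, k):
    """Brute force: k-element subsets of 1..n with no two consecutive members."""
    return sum(
        1
        for c in combinations(range(1, n + 1), k)  # sorted, no repeats
        if all(b - a > 1 for a, b in zip(c, c[1:]))  # no neighbours like (1,2)
    )

def closed_form(n, k):
    """Standard closed form for the same count: C(n - k + 1, k)."""
    return comb(n - k + 1, k)

# Hypothetical sizes, chosen only to keep the brute force fast.
n, k = 20, 4

raw = n ** k              # ordered tuples, repeats and neighbours allowed
valid = closed_form(n, k)
assert count_valid_sets(n, k) == valid

print(f"unconstrained tuples: {raw} (~{log2(raw):.1f} bits per line)")
print(f"valid sets:           {valid} (~{log2(valid):.1f} bits per line)")
```

The gap between the two bit counts is the saving that comes purely from the constraints, before looking at any redundancy in the storage format itself.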

But the OP's format is obviously highly redundant, he's not only

The last point alone leaves enough room for compression well beyond 50%.

Roboticus and I have already spelled this out explicitly by listing all possible independent tuples and pointing to their near-optimal compression using Huffman coding.
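For readers who have not seen why Huffman coding gets so close to the optimum, here is a small generic sketch, again in Python. The symbols and frequencies are invented for illustration and are not the tuple table from this thread. It computes the Huffman code length for each symbol and compares the average length with the entropy bound.

```python
import heapq
from math import log2

def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs."""
    if len(freqs) == 1:
        return {sym: 1 for sym in freqs}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)  # two lightest subtrees
        w2, _, d2 = heapq.heappop(heap)
        # Merging pushes every symbol in both subtrees one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

# Hypothetical symbol frequencies (not data from the thread).
freqs = {"a": 45, "b": 25, "c": 15, "d": 10, "e": 5}

lengths = huffman_code_lengths(freqs)
total = sum(freqs.values())
avg_bits = sum(freqs[s] * lengths[s] for s in freqs) / total
entropy = -sum((w / total) * log2(w / total) for w in freqs.values())

print(lengths)
print(f"average {avg_bits:.3f} bits/symbol, entropy bound {entropy:.3f}")
```

With five symbols a fixed-width code would need 3 bits each, so a skewed distribution like this one already saves a good third, and the average stays within one bit of the entropy bound.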

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery
FootballPerl is like chess, only without the dice
