hola;

from what i gather i think this is a case for good ol' Run Length Encoding (RLE):

here is RLE for an array in a couple of lines of not very strict code.

the basic idea is to look for runs of a single repeated symbol in your array, and replace each run with a single instance of the symbol plus a run count. it pays off when the data contains long runs, which is more likely if your set of symbols (numbers, in this case) is small.

here goes. the result is stored in @runlengths as array refs of [symbol, count], starting at index 1.

    @test = (23,23,4,8,21,90,90,90,90,2,2,2,19,21,19);
    my ($last, $length, $run) = (undef, 0, 0);
    foreach my $sym (@test) {
        # extend the current run, or start a new one (runs are stored from index 1)
        if (defined $last && $sym == $last) { $length++ }
        else                                { $run++; $length = 1 }
        $last = $sym;
        $runlengths[$run] = [$sym, $length];
    }

ok, so what does it look like?

using

    @strings = map { $runlengths[$_][0] . "x" . $runlengths[$_][1] } (1 .. $#runlengths);

gives us the set of strings

("23x2","4x1","8x1","21x1" ... "19x1")
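going the other way from the string form is short too. a minimal sketch, assuming the "symbolxcount" format above; the full list of strings here is worked out by hand from the @test array, and @decoded is just a name picked for illustration:

```perl
use strict;
use warnings;

# "symbolxcount" strings for the test array, written out by hand
my @strings = qw(23x2 4x1 8x1 21x1 90x4 2x3 19x1 21x1 19x1);

# split each string on "x" and repeat the symbol count times
my @decoded = map {
    my ($symbol, $count) = split /x/, $_;
    ($symbol) x $count;
} @strings;

print join(",", @decoded), "\n";
# prints 23,23,4,8,21,90,90,90,90,2,2,2,19,21,19
```

note that this particular test array has 15 elements and 9 runs, so storing a symbol and a count per run takes 18 numbers. RLE only wins once the runs get longer.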

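and i guess no compression routine is complete without an extraction function, which i have not optimized much here. it takes a 0-based index into the original array.

    sub extract {
        my $index = shift;
        my ($last, $lastindex) = (undef, 0);
        # walk the runs (stored from index 1), counting how many elements they cover so far
        foreach (@runlengths[1 .. $#runlengths]) {
            ($last, $lastindex) = ($$_[0], $lastindex + $$_[1]);
            return $last if $index <= $lastindex - 1;
        }
        return undef;    # index is past the end of the data
    }
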
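if lookups matter, one possible optimization is to precompute where each run ends and binary search on that, instead of walking every run. this is only a sketch under my own assumptions: @runs, @ends and extract_fast are hypothetical names, and the runs are 0-indexed here, unlike @runlengths.

```perl
use strict;
use warnings;

# runs as [symbol, count] pairs for the test array, 0-indexed
my @runs = ([23,2],[4,1],[8,1],[21,1],[90,4],[2,3],[19,1],[21,1],[19,1]);

# for each run, the position one past its last element in the original array
my @ends;
my $total = 0;
for my $run (@runs) {
    $total += $run->[1];
    push @ends, $total;
}

# binary search for the first run whose end position exceeds $index
sub extract_fast {
    my ($index) = @_;
    return if $index < 0 || $index >= $total;    # out of range
    my ($lo, $hi) = (0, $#ends);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($ends[$mid] > $index) { $hi = $mid }
        else                      { $lo = $mid + 1 }
    }
    return $runs[$lo][0];
}

print join(",", map { extract_fast($_) } 0 .. $total - 1), "\n";
# prints 23,23,4,8,21,90,90,90,90,2,2,2,19,21,19
```

building @ends costs one pass over the runs, after which each lookup takes a number of steps logarithmic in the number of runs.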
a final note: no discussion of compression and perl is complete without reference to the Mark Jason Dominus article on Huffman encoding, which would grace even a royal toilet:

http://perl.plover.com/Huffman/huffman.html

that is meant as sincere flattery btw.

hope that helps

...wufnik

-- in the world of the mules there are no rules --

In reply to RLE for simple array compression by wufnik
in thread pack unpack charcount repetition by denthijs
