Re: Compressing a text file using count of continous characters

My guess is you'll need to decode it at some point too. This is all very similar to run-length encoding (RLE), though I think RLE encodes even single-character occurrences. So here's a short snippet to encode/decode, which you can modify to suit your needs:

use strict;

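sub encode {
    my ($str) = @_;
    # Replace each run of 2+ identical characters with "<count><char>".
    $str =~ s/((.)\2+)/length($1) . $2/eg;
    return $str;
}

sub decode {
    my ($str) = @_;
    my @list;
    # Collect [count, char] pairs; count is undef for a lone character.
    # /s lets "." match the trailing newline so it survives decoding.
    while ($str =~ /(\d+)?(.)/gs) {
        push @list, [$1, $2];
    }
    return join '', map { (defined $_->[0]) ? $_->[1] x $_->[0] : $_->[1] } @list;
}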

while (<DATA>) {
    print;
    my $enc = encode($_);
    my $dec = decode($enc);
    print $enc;
    print $dec;
}

__DATA__
XYZAAAAAAAADEFAAcdAA

Which gives the following output:

XYZAAAAAAAADEFAAcdAA
XYZ8ADEF2Acd2A
XYZAAAAAAAADEFAAcdAA
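
For comparison, here is a minimal sketch of the classic RLE variant mentioned above, where every run becomes a count/character pair, even a run of length one. The names rle_encode and rle_decode are made up for illustration, and the sketch assumes the input contains no digits, since a digit in the data would be indistinguishable from a count.

```perl
use strict;
use warnings;

# Classic RLE sketch: every run, even a single character, becomes "<count><char>".
# Assumption: the input contains no digits. Sub names are illustrative only.
sub rle_encode {
    my ($str) = @_;
    $str =~ s/((.)\2*)/length($1) . $2/eg;   # \2* also matches runs of length 1
    return $str;
}

sub rle_decode {
    my ($str) = @_;
    $str =~ s/(\d+)(.)/$2 x $1/eg;           # expand each pair back into a run
    return $str;
}

my $enc = rle_encode('XYZAAAB');
print "$enc\n";                 # 1X1Y1Z3A1B
print rle_decode($enc), "\n";   # XYZAAAB
```

Note that this form grows text that has few repeats, which is presumably why the snippet above leaves single characters alone.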

---
s;;:<).>|\;\;_>?\\^0<|=!]=,|{\$/.'>|<?.|/"&?=#!>%\$|#/\$%{};;y;,'} -/:-@[-`{-};,'}`-{/" -;;s;;$_;see;
Warning: Any code posted by tuxz0r is untested, unless otherwise stated, and is used at your own risk.


Re^2: Compressing a text file using count of continous characters by educated_foo (Vicar) on Dec 14, 2007 at 20:48 UTC
You should at least make it symmetric...

sub encode { $_ = shift; s/((\D)\2+)/length($1).$2/eg; $_ }
sub decode { $_ = shift; s/(\d+)(\D)/$2 x $1/eg; $_ }
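
A quick round-trip check makes it easier to see where the symmetric pair holds up and where it does not. The sketch below restates the two subs with a lexical copy instead of $_ (the behaviour is meant to be the same), and round_trips is a hypothetical helper added only for this illustration. Digit-free input survives, but input that already contains a digit does not, because encode passes the digit through and decode then reads it as a count.

```perl
use strict;
use warnings;

# The two subs from this reply, using a lexical copy rather than $_.
sub encode { my ($s) = @_; $s =~ s/((\D)\2+)/length($1) . $2/eg; return $s }
sub decode { my ($s) = @_; $s =~ s/(\d+)(\D)/$2 x $1/eg;         return $s }

# Hypothetical helper: true if decoding the encoded text gives back the input.
sub round_trips {
    my ($s) = @_;
    return decode(encode($s)) eq $s;
}

# 'a2b' contains a digit, so decode turns "2b" into "bb".
for my $input ('XYZAAAAAAAADEFAAcdAA', 'a2b') {
    printf "%-22s %s\n", $input, round_trips($input) ? 'ok' : 'NOT ok';
}
```

If the data can contain digits, one option is to escape them or pick a count delimiter that cannot occur in the text.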