A comment on:

> I'm pretty sure that zip has some fix overhead which doesn't pay off with just 189 bytes input.

To prove my point, here is an altered version of your code that fakes longer data by rotating the original input; it shows zip at about 5% compression. That's a factor of 4 better than your champion.

(Of course, rotating is kind of biased, because it keeps most run-length chunks intact and zip will efficiently Huffman-code all the runs it finds.° But it's up to the OP to provide unbiased data, I'm no psychic... ;-)

FWIW: zip is already at 18% after only tripling the input.
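The fixed-overhead effect can be watched directly by compressing a growing number of rotated copies of one sample line. This is only a minimal sketch: `gzip_size` is a hypothetical helper, and the rotation offset is made deterministic (7 times the copy index, modulo the line length) instead of random, so the exact percentages will differ from the ones quoted in this post.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# First line of the sample data from the thread (63 letters).
my $line = 'ABBCBCAAAAABBCBCACCCAAAAACAAAAABBBBBAAAAABBAAAAAAAABBCCCACCAABC';

# Hypothetical helper: returns (compressed size, raw size) for $copies
# rotated copies of $line. The offset is deterministic here so that runs
# are repeatable; the listing in this post uses rand() instead.
sub gzip_size {
    my ($copies) = @_;
    my $data = join '', map {
        my $off = (7 * $_) % length($line);
        substr($line, $off) . substr($line, 0, $off);
    } 0 .. $copies - 1;
    gzip \$data => \my $out or die "gzip failed: $GzipError";
    return (length($out), length($data));
}

for my $copies (1, 3, 10, 100) {
    my ($packed, $raw) = gzip_size($copies);
    printf "%4d copies: %6d -> %5d bytes (%.1f%%)\n",
        $copies, $raw, $packed, 100 * $packed / $raw;
}
```

The gzip container alone costs 18 bytes of header and trailer, which weighs heavily on a 63-byte input and hardly at all on a few thousand bytes.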

use strict;                  # https://perlmonks.org/?node_id=11136582
use warnings;                # originally https://perlmonks.org/?displaytype=displaycode;node_id=11136614;abspart=1;part=1
use feature 'say';



for my $try (
             [ 'gzip/gunzip', \&compgzip, \&uncompgzip ],
             [ '2 bit code, 6 bit runlength', \&comp62, \&uncomp62 ],
             [ '2 bits per letter', \&comp2bits, \&uncomp2bits ],
             [ 'groups of 5,2,1', \&comp5, \&uncomp5 ],
            ) {
    my ($method, $comp, $uncomp) = @$try;
    print "\n------------------------------ Compression by $method\n\n";
    my $data = <<END;
ABBCBCAAAAABBCBCACCCAAAAACAAAAABBBBBAAAAABBAAAAAAAABBCCCACCAABC
BCCCBCAACAABBBCAAACCAAAAACAAAAABBBBBAAAAABBAAAAAAAABBCCCACCABBC
ABCCBBBAAAABBABCACABCCCCCCAAAAABBCBBCCCCAAAAAAAAAAAAACCCACCACCC
END

    my @data = split/\n/, $data;

    push @data, map{rotate($data[$_])} 0..2 for 1..100;      # fake x100 times data

    $data = join "", @data;  # remove \n they can be re-inserted later
    print "length of            data @{[ $data =~ tr/ABC// ]}\n";
    my $compressed = $comp->($data);
    print "length of compressed data @{[ length $compressed ]}\n";
    #use Data::Dump 'dd'; dd $compressed;
    # print unpack('H*', $compressed), "\n";
    my $uncompressed = $uncomp->($compressed);
    printf "compressed to %.1f%%\n",
      100 * length($compressed) / length $uncompressed;
    print $data eq $uncompressed ? "MATCH" : "************ no MATCH", "\n";
}

# by lanx
sub rotate {                            # fake data from original sample
    # say "O: ",
      my $orig =shift;

    # say
      my $rnd  = int rand 63;
    my $head = substr $orig,0,$rnd,"";
    # say "N: ",
      my $new = $orig.$head;
    return $new;
}


# compress by groups of 5,2,1 to single letter

sub comp5
  {
      my @code = map glob('{A,B,C}' x $_), 5, 2, 1;
      my %code;
      @code{@code} = map chr, 1 .. @code;
      local $" = '|';
      shift =~ s/(@code)/$code{$1}/gr
  }
sub uncomp5
  {
      my @code = map glob('{A,B,C}' x $_), 5, 2, 1;
      my %code;
      @code{map chr, 1 .. @code} = @code;
      join '', @code{split //, shift};
  }

# compress by lower two bits of letter

sub comp2bits
  {
      my ($ans, $n) = ('', 0);
      vec($ans, $n++, 2) = 3 & ord $_ for split //, shift;
      $ans;
  }
sub uncomp2bits
  {
      my $comp = shift;
      join '', map
        {
            ('', 'A', 'B', 'C')[ vec $comp, $_, 2]
        } 0 .. -1 + 4 * length $comp;
  }

# compress by run length: 6 bits length and 2 bits letter code

sub comp62
  {
      shift =~ s/([ABC])\1{0,62}/ chr( length($&) << 2 | ord($1) & 3) /ger;
  }
sub uncomp62
  {
      shift =~ s/./ (($& & "\3") | '@') x (ord($&) >> 2) /gesr;
  }

# compress by gzip

use IO::Compress::Gzip qw(gzip);
sub compgzip
  {
      gzip \(shift) => \(my $output);
      $output;
  }
use IO::Uncompress::Gunzip qw(gunzip);
sub uncompgzip
  {
      gunzip \(shift) => \(my $output);
      $output;
  }
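To make the byte layout of the '2 bit code, 6 bit runlength' method easier to follow, here is a small sketch that packs and unpacks a single run. The names `pack_run` and `unpack_run` are hypothetical and not part of the listing; they only spell out what the two substitutions in `comp62` and `uncomp62` do per match.

```perl
use strict;
use warnings;

# Pack one run: the upper 6 bits hold the run length (1..63),
# the lower 2 bits hold the letter code (A=1, B=2, C=3, i.e. ord & 3).
sub pack_run {
    my ($letter, $len) = @_;
    return chr( ($len << 2) | (ord($letter) & 3) );
}

# Unpack one byte back into the run it stands for.
sub unpack_run {
    my ($byte) = @_;
    my $code   = ord($byte) & 3;            # 1, 2 or 3
    my $letter = chr( ord('@') | $code );   # '@' is 0x40, so 0x41..0x43 = A..C
    return $letter x (ord($byte) >> 2);
}

my $byte = pack_run('A', 5);
printf "AAAAA -> 0x%02X -> %s\n", ord($byte), unpack_run($byte);
```

Since every run costs exactly one byte, the compressed length equals the number of runs, so this scheme only pays off when runs are long on average.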

-*- mode: compilation; default-directory: "d:/tmp/pm/" -*-
Compilation started at Fri Sep 10 13:25:18

C:/Strawberry/perl/bin\perl.exe -w d:/tmp/pm/pack_63_chars.pl

------------------------------ Compression by gzip/gunzip

length of            data 19089
length of compressed data 1021
compressed to 5.3%
MATCH

------------------------------ Compression by 2 bit code, 6 bit runlength

length of            data 19089
length of compressed data 7714
compressed to 40.4%
MATCH

------------------------------ Compression by 2 bits per letter

length of            data 19089
length of compressed data 4773
compressed to 25.0%
MATCH

------------------------------ Compression by groups of 5,2,1

length of            data 19089
length of compressed data 3819
compressed to 20.0%
MATCH

Compilation finished at Fri Sep 10 13:25:18
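The two fixed-rate results in that output can be checked by plain arithmetic, which also shows that they do not depend on the content of the data at all. A small sketch, assuming the 19089 letters reported above and that the 5,2,1 regex always takes a group of 5 while at least five letters remain:

```perl
use strict;
use warnings;
use POSIX qw(ceil);

my $n = 19089;    # number of letters, taken from the output above

# 2 bits per letter: four letters per byte, last byte padded.
my $two_bits = ceil($n / 4);

# Groups of 5,2,1: one byte per full group of 5, then the remainder is
# covered by groups of 2 and, if needed, one single letter.
my $rest   = $n % 5;
my $groups = int($n / 5) + int($rest / 2) + $rest % 2;

printf "2 bits per letter: %d bytes\n", $two_bits;
printf "groups of 5,2,1:   %d bytes\n", $groups;
```

Both figures agree with the measured 4773 and 3819 bytes, so only gzip and the run-length variant actually react to the structure of the input.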

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

°) Even after constantly reversing one part of the input, I'm at 9% compression for factor-100 input.
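The code for that reversing experiment is not shown here, so the following is only one possible reading of it: a hypothetical `rotate_reversed` that rotates like `rotate` and additionally reverses the part moved to the end. The split position is a parameter in this sketch so the result is repeatable; the post's `rotate` picks it with `int rand 63`.

```perl
use strict;
use warnings;

# Hypothetical variant of rotate(): cut the first $pos letters off the
# front and append them reversed, which breaks up some of the repeats.
sub rotate_reversed {
    my ($orig, $pos) = @_;
    my $head = substr $orig, 0, $pos, '';     # removes the head from $orig
    return $orig . scalar reverse $head;      # append it back to front
}

print rotate_reversed('ABBCBCAAAA', 3), "\n";   # prints CBCAAAABBA
```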

In reply to Re^3: How to efficently pack a string of 63 characters (longer input) by LanX
in thread How to efficently pack a string of 63 characters by baxy77bax
