in reply to Re^7: a random_data() implementation
in thread How to efficently pack a string of 63 characters

If I understand you correctly: yes. The 3-line data contained runs like AAAAAAA, which my generator picked up, so it produced lines like ACAAAABBBCCCAAAABBBBCCAAABCAAAAAAABBBBBBBBBBBBBCAAAACAAABCAABCA

$data = join "", @data, @$random_data; # remove \n; they can be re-inserted later

# Add this to print all data
print join("\n", @$random_data);

Re^9: a random_data() implementation
by LanX (Saint) on Sep 11, 2021 at 14:31 UTC
    Thanks!

    I looked into it and could reproduce your results.

    FWIW I tried best compression for gzip

    use IO::Compress::Gzip qw(gzip :constants);

    sub compgzip {
        gzip \(shift) => \(my $output), -Level => Z_BEST_COMPRESSION;
        $output;
    }

    use IO::Uncompress::Gunzip qw(gunzip);

    sub uncompgzip {
        gunzip \(shift) => \(my $output);
        $output;
    }

    and got

    ------------------------------
    Compression by gzip/gunzip
    length of data 210168
    length of compressed data 42210
    compressed to 20.1%
    MATCH
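    For reference, a minimal sketch of how a report in this shape could be produced. The sample data is a hypothetical stand-in (random runs of A, B and C), not the data set from this thread, so the numbers it prints will differ from the ones quoted. The percentage is simply compressed length divided by original length, e.g. 42210 / 210168 for the figures above.

```perl
use strict;
use warnings;
use IO::Compress::Gzip     qw(gzip $GzipError :constants);
use IO::Uncompress::Gunzip qw(gunzip $GunzipError);

# Hypothetical sample data: random runs of A, B and C (NOT the thread's data).
srand(42);
my @sym  = qw(A B C);
my $data = "";
for (1 .. 5000) {
    $data .= $sym[ int rand(3) ] x ( 1 + int rand(8) );
}

# Compress a string in memory at the highest gzip level.
sub compgzip {
    my ($in) = @_;
    gzip( \$in => \( my $out ), Level => Z_BEST_COMPRESSION )
        or die "gzip failed: $GzipError";
    return $out;
}

# Decompress a gzip string in memory.
sub uncompgzip {
    my ($in) = @_;
    gunzip( \$in => \( my $out ) )
        or die "gunzip failed: $GunzipError";
    return $out;
}

my $packed   = compgzip($data);
my $unpacked = uncompgzip($packed);

print  "-" x 30, "\n";
print  "Compression by gzip/gunzip\n";
printf "length of data %d\n",            length $data;
printf "length of compressed data %d\n", length $packed;
printf "compressed to %.1f%%\n",         100 * length($packed) / length($data);
print  $unpacked eq $data ? "MATCH\n" : "MISMATCH\n";
```

    The final line only checks that the round trip restores the input byte for byte; it says nothing about how good the ratio is.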

    Update

    I noticed that -Strategy => Z_RLE alone already compressed to 20.9%, so my theory is that your runs are so homogeneously distributed that the second-phase Huffman coding couldn't squeeze more than another 0.8 percentage points out of it.
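    A minimal sketch for comparing strategies side by side, assuming hypothetical sample data (random runs of A, B and C) rather than the data from this thread. The helper name gzip_size is made up for the illustration. Note that in zlib, Z_RLE limits string matches to distance one and still Huffman-codes the result, while Z_HUFFMAN_ONLY disables string matching altogether, so comparing all three helps separate the two effects.

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError :constants);

# Hypothetical sample data: random runs of A, B and C (NOT the thread's data).
srand(42);
my @sym  = qw(A B C);
my $data = "";
for (1 .. 5000) {
    $data .= $sym[ int rand(3) ] x ( 1 + int rand(8) );
}

# Return the gzip-compressed size of a string for the given extra options.
sub gzip_size {
    my ( $in, %opt ) = @_;
    gzip( \$in => \( my $out ), Level => Z_BEST_COMPRESSION, %opt )
        or die "gzip failed: $GzipError";
    return length $out;
}

my %strategy = (
    default      => Z_DEFAULT_STRATEGY,   # LZ77 matching plus Huffman coding
    rle          => Z_RLE,                # matches limited to distance one
    huffman_only => Z_HUFFMAN_ONLY,       # no string matching at all
);

my %size;
for my $name ( sort keys %strategy ) {
    $size{$name} = gzip_size( $data, Strategy => $strategy{$name} );
    printf "%-13s %7d bytes  %5.1f%%\n",
        $name, $size{$name}, 100 * $size{$name} / length($data);
}
```

    If the rle and default figures come out close together, as in the numbers above, most of the gain comes from the runs themselves rather than from longer-distance matches.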

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Re^9: a random_data() implementation
by LanX (Saint) on Sep 10, 2021 at 20:51 UTC
    OK, strange ... I would have expected zip to perform better, then.

    Cheers Rolf