Re^2: Efficient 7bit compression

by Limbic~Region (Chancellor)
on Mar 14, 2005 at 15:52 UTC (#439321)


in reply to Re: Efficient 7bit compression
in thread Efficient 7bit compression

Roy Johnson,
This looks similar to my unshown solution, which trades memory for time. The $bitstr will temporarily be 8 times larger than the original $str. I outlined 2 kinds of efficiency in my post (RAM & runtime). Now, if only there were a way to have our cake and eat it too.
Thanks!

Cheers - L~R
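
To make the "8 times larger" remark concrete, here is a minimal sketch. The 1000-byte input is a made-up example; the point is only that unpack('B*', ...) yields one '0' or '1' character per bit of input.

```perl
use strict;
use warnings;

my $str    = 'x' x 1000;            # hypothetical 1000-byte input
my $bitstr = unpack('B*', $str);    # one '0'/'1' character per input bit

# The temporary bit string costs 8 bytes for every byte of input.
printf "original: %d bytes, bit string: %d bytes\n",
    length($str), length($bitstr);
```

This is why a whole-string unpack is fast but memory-hungry, and why working on a small chunk at a time keeps the overhead constant.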

Replies are listed 'Best First'.
Re^3: Efficient 7bit compression
by Roy Johnson (Monsignor) on Mar 14, 2005 at 16:05 UTC
    This has constant overhead by expanding only 8 chars at a time.
        sub shrink {
            my $packed = '';
            # Take up to 8 characters (64 bits) at a time.
            while ($_[0] =~ /(.{1,8})/gs) {
                my $bitstr = unpack('B*', $1);
                $bitstr =~ s/.(.{7})/$1/g;    # drop the high bit of each byte
                $packed .= pack('B*', $bitstr);
            }
            $packed;
        }

        sub grow {
            my $unpacked = '';
            # Take up to 7 packed bytes (56 bits) at a time.
            while ($_[0] =~ /(.{1,7})/gs) {
                my $bitstr = unpack('B*', $1);
                $bitstr =~ s/(.{7})/0$1/g;    # restore a zero high bit
                $unpacked .= pack('B*', $bitstr);
            }
            $unpacked;
        }

    Caution: Contents may have been coded under pressure.
      Roy Johnson,
      I tried a similar approach, only using vec() and starting 1 bit back from the end of the string each time, but I got bizarre results. Had I made that work, my next idea was to pre-size the string and use 4-arg substr instead of growing it each time.

      Cheers - L~R
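
The pre-sizing idea could look roughly like the sketch below. It is not code from this thread: shrink_presized is a hypothetical name, and the sketch borrows the 8-characters-at-a-time unpack/pack step from the shrink sub above. The result buffer is allocated once at its final length and then overwritten with 4-arg substr.

```perl
use strict;
use warnings;

# Hypothetical helper: pack 7-bit characters into a buffer that is
# allocated once, then overwritten chunk by chunk with 4-arg substr.
sub shrink_presized {
    my ($str)  = @_;
    my $len    = length $str;
    my $outlen = int(($len * 7 + 7) / 8);    # ceil($len * 7 / 8)
    my $packed = "\0" x $outlen;             # pre-size the result

    my $out = 0;
    for (my $in = 0; $in < $len; $in += 8) {
        my $bits = unpack('B*', substr($str, $in, 8));
        $bits =~ s/.(.{7})/$1/g;             # drop the high bit of each byte
        my $chunk = pack('B*', $bits);
        # Replace exactly length($chunk) bytes, so $packed never grows.
        substr($packed, $out, length $chunk, $chunk);
        $out += length $chunk;
    }
    return $packed;
}

printf "%d bytes in, %d bytes out\n", 16, length shrink_presized('a' x 16);
```

Whether this actually beats appending with .= would need benchmarking; Perl already over-allocates strings as they grow, so the gain may be small.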

        Are you talking about shrinking in place, so you don't have the overhead of having the compressed and uncompressed strings in memory at the same time?
        my $str = <<'EOS';
        There once was a man from Nantucket,
        Who kept all his cash in a bucket.
        His daughter, named Nan,
        Ran off with a man.
        And as for the bucket, Nantucket.
        EOS

        sub shrink_in_place {
            # Each 8-char window shrinks to 7 bytes, so advance by 7.
            for (my $i = 0; $i < length($_[0]); $i += 7) {
                for (substr($_[0], $i, 8)) {    # alias $_ to the substr lvalue
                    $_ = pack('B*', grep s/.(.{7})/$1/g, unpack('B*', $_));
                }
            }
        }

        sub grow_in_place {
            # Each 7-byte window grows to 8 chars, so advance by 8.
            for (my $i = 0; $i < length($_[0]); $i += 8) {
                for (substr($_[0], $i, 7)) {
                    $_ = pack('B*', grep s/(.{7})/0$1/g, unpack('B*', $_));
                }
            }
        }

        print "$str";
        printf "Length is %d\n", length($str);
        shrink_in_place($str);
        printf "Shrunk length is %d\n", length($str);
        grow_in_place($str);
        print "Restored: $str";

        Caution: Contents may have been coded under pressure.
Re^3: Efficient 7bit compression
by Ven'Tatsu (Deacon) on Mar 14, 2005 at 19:43 UTC
    I'm not sure how this fares with regard to speed, but it only uses 2-5 scalars, tries to avoid generating intermediate lists, and modifies its argument in place.
        sub shrink {
            my $dbit = 0;
            # Copy every bit except the high bit of each byte down to $dbit.
            for (my $sbit = 0; $sbit < length($_[0]) * 8; $sbit++) {
                next if $sbit % 8 == 7;
                vec($_[0], $dbit++, 1) = vec($_[0], $sbit, 1);
            }
            # Packed length in bytes, rounded up.
            my $dlen = length($_[0]) * 7 / 8;
            $dlen++ unless $dlen == int($dlen);
            $dlen = int($dlen);
            my $extra = length($_[0]) - $dlen;
            if ($extra > 0) {
                substr($_[0], $dlen, $extra, '');
            }
            # Zero the padding bits in the last byte.
            for (my $pbit = $dbit; $pbit < $dlen * 8; $pbit++) {
                vec($_[0], $pbit, 1) = 0;
            }
        }

        sub grow {
            my $sbit = int(length($_[0]) * 8 / 7) * 7 - 1;
            # Work backwards so source bits are read before they are overwritten.
            for (my $dbit = int(length($_[0]) * 8 / 7) * 8 - 1; $dbit >= 0; $dbit--) {
                vec($_[0], $dbit, 1) = $dbit % 8 == 7 ? 0 : vec($_[0], $sbit--, 1);
            }
        }
    I have the nagging feeling there is a better way to implement ceil (near the middle of shrink).
    If padding the compressed string with 0 bits is not needed then the second for loop of shrink can be omitted to save (on average) 4 bits of time.
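
The `$sbit % 8 == 7` test in shrink depends on how vec() numbers bits: with a width of 1, offset 0 is the least significant bit of the first byte, so offset 7 is the high bit. That is the opposite of the order unpack('B*', ...) prints, and mixing the two conventions is one possible source of surprising results. A minimal sketch, using a made-up one-byte string:

```perl
use strict;
use warnings;

my $byte = "\x80";    # only the most significant bit is set

print "B8 (MSB first): ", unpack('B8', $byte), "\n";   # 10000000
print "b8 (LSB first): ", unpack('b8', $byte), "\n";   # 00000001
print "vec bit 0: ", vec($byte, 0, 1), "\n";           # 0
print "vec bit 7: ", vec($byte, 7, 1), "\n";           # 1
```

So vec() offsets line up with the 'b' (ascending) unpack format, not with 'B'.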
      ...a better way to implement ceil...
              my $dlen = int((7 + length($_[0]) * 7)/8);
              substr($_[0], $dlen) = '';
              vec ($_[0], $dbit++, 1) = 0 while ($dbit < $dlen * 8);
          }
      This replaces your shrink function from the definition of $dlen on.
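
As a quick sanity check on that integer-arithmetic ceil, the sketch below compares it against POSIX::ceil for a range of lengths. The sub name packed_len and the 0..64 range are assumptions chosen only for illustration.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Hypothetical helper wrapping the expression from the snippet above.
sub packed_len {
    my ($len) = @_;
    return int((7 + $len * 7) / 8);
}

for my $len (0 .. 64) {
    die "mismatch at length $len"
        unless packed_len($len) == ceil($len * 7 / 8);
}
print "int((7 + \$len * 7) / 8) agrees with ceil(\$len * 7 / 8) for 0..64\n";
```

Adding 7 before dividing by 8 rounds up without a floating-point comparison, which is the usual integer idiom for ceil(a / b): int((a + b - 1) / b).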

      Caution: Contents may have been coded under pressure.
