in reply to unpacking 6-bit values

BrowserUk:

Update 3: I can't make pack & unpack work. I think endianness is biting me. A simple conversion of the C program to perl works, but isn't as nice to use as pack/unpack would have been. My best perl attempt so far is:

# Roboticus' second attempt
my $robo_2 = sub {
    my @orig = unpack "S*", shift;
    my $bits = 0;
    my $buf = 0;
    my @bytes;
    while (@orig) {
        $buf |= (shift @orig) << $bits;
        $bits += 16;
        while ($bits > 6) {
            push @bytes, $buf & 0x3f;
            $buf >>= 6;
            $bits -= 6;
        }
    }
    push @bytes, $buf if $bits;
    return (@bytes);
};
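One thing worth noting about the attempt above: the "S" template reads 16-bit words in the machine's native byte order, so its output can differ between little-endian and big-endian hosts. A byte-wise variant sidesteps that. The following is only a sketch under that assumption, and six_bit_bytewise is a made-up name, not code from the thread:

```perl
use strict;
use warnings;

# Hypothetical byte-wise variant of robo_2: "C*" yields one unsigned byte
# at a time, so the result does not depend on the machine's byte order.
sub six_bit_bytewise {
    my @in = unpack "C*", shift;
    my ($buf, $bits, @out) = (0, 0);
    for my $byte (@in) {
        $buf |= $byte << $bits;     # append the new byte above the leftover bits
        $bits += 8;
        while ($bits >= 6) {
            push @out, $buf & 0x3f; # emit the low six bits
            $buf >>= 6;
            $bits -= 6;
        }
    }
    push @out, $buf if $bits;       # flush any leftover bits
    return @out;
}

printf "%02x ", $_ for six_bit_bytewise("Now");
print "\n";    # prints: 0e 3d 36 1d
```

On a little-endian machine this should give the same values as the "S*" version for even-length input.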

When I was looking at pack trying to find a way to efficiently do the task, I noticed that it has a u flag for uuencoding! So just use that to do the job. You'll have to set the "length of data" byte at the front, but that should be trivial.

roboticus@Boink:~ $ cat 876421.pl
#!/usr/bin/perl
use strict;
use warnings;

my $orig = "Now is the time for all ";
my $enc = pack "u", $orig;
print "$enc\n";
my $dec = unpack "u", $enc;
print "$dec\n";
my (undef, @bytes) = map { 0x3f & ord($_) } split //, $enc;
printf "%02x ", $_ for @bytes;

roboticus@Boink:~ $ perl 876421.pl
83F]W(&ES('1H92!T:6UE(&9O<B!A;&P@
Now is the time for all
33 06 1d 17 28 26 05 13 28 27 31 08 39 32 21 14 3a 36 15 05 28 26 39 0f 3c 02 21 01 3b 26 10 00 0a

...roboticus

When your only tool is a hammer, all problems look like your thumb.

UPDATE: The reason I was looking into pack was that I had a better C program that I thought might translate into a quick perl script. The newer C program is:

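#include <stdio.h>

unsigned char inbuf[] = "Now is the time for all ";
unsigned char outbuf[50];

/* Print n bytes starting at p as two-digit hex values. */
void hexdump(unsigned char *p, int n) {
    for (int i = 0; i < n; i++) {
        printf("%02x ", *p++);
    }
    puts("\n");
}

int main(void) {
    unsigned char *src = inbuf;
    unsigned char *srcEnd = src + sizeof(inbuf);  /* sizeof includes the trailing NUL */
    unsigned char *dst = outbuf;

    hexdump(src, sizeof(inbuf));

    int bits = 0;  /* number of unconsumed bits held in buf */
    int buf = 0;   /* bit accumulator, oldest bits lowest */
    while (src != srcEnd) {
        buf |= (*src++) << bits;
        bits += 8;
        while (bits > 6) {
            *dst++ = buf & 0x3f;  /* emit the low six bits */
            buf >>= 6;
            bits -= 6;
        }
    }
    /* Any bits still held in buf at this point are not written out. */
    hexdump(outbuf, dst - outbuf);
    return 0;
}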
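The shift-and-mask loop in the C program consumes each byte least-significant bit first. In Perl, that is the bit order unpack's "b*" template produces, so the same split can be sketched as a bit string cut into 6-character chunks. This is an illustration rather than a tuned solution (it is unlikely to be fast), and six_bit_lsb_first is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical helper: split a byte string into 6-bit values, taking the
# least significant bit of each byte first.
sub six_bit_lsb_first {
    my ($data) = @_;
    my $bits = unpack "b*", $data;   # "b" = ascending bit order within each byte
    # Each chunk is LSB-first, so reverse it before reading it as binary.
    return map { oct("0b" . scalar reverse($_)) } $bits =~ /(.{1,6})/g;
}

printf "%02x ", $_ for six_bit_lsb_first("Now");
print "\n";    # prints: 0e 3d 36 1d
```

A trailing chunk shorter than six bits is kept and read as a small value, which corresponds to flushing the leftover bits of the accumulator.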

Update 2: I forgot to remove the top two bits of the bytes to give you the values you wanted, and I also stripped out the first byte. To do so, I added the last two lines to the code (above) and I added the last line of output (above). Of course, the overhead of splitting the string back into individual characters and transforming the bytes with map may well make the uuencoded version slower than some of the alternatives. Ah, well.
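For reference, uuencoding stores each 6-bit value v as the character chr(32 + v) (zero is commonly written as a backtick), so recovering the raw values takes a subtraction before the mask. A minimal sketch, assuming a single uuencoded line; uu_six_bit is a hypothetical helper name, not something from the thread:

```perl
use strict;
use warnings;

# Hypothetical helper: recover the 6-bit values carried by one uuencoded line.
sub uu_six_bit {
    my ($line) = @_;
    chomp $line;
    my (undef, @chars) = split //, $line;       # drop the leading length character
    # Subtract the 0x20 offset; the mask also maps a backtick (0x60) to zero.
    return map { (ord($_) - 32) & 0x3f } @chars;
}

my $enc = pack "u", "Now";
printf "%02x ", $_ for uu_six_bit($enc);
print "\n";    # prints: 13 26 3d 37
```

Note that uuencode groups the bits of each 3-byte block most-significant first, so these values are not the LSB-first ones that robo_2 and the C program produce.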

Re^2: unpacking 6-bit values
by BrowserUk (Patriarch) on Dec 10, 2010 at 15:19 UTC
    Update 3: I can't make pack & unpack work

    Phew! I thought I was going nuts :)

    From what I read, uuencode converts each 3-bytes to 4-bytes, which is the reverse of what I need.

    Then I thought maybe I could use the decode process to perform my encoding, but I couldn't make that work either :(

    But it sure did sound like a cool idea to start with.

      BrowserUk:

      Yep. I saw it, and then looked at Wikipedia to check out what it was, and they had a nice little example there. I was hoping it was going to be fast, so I was putting it in a benchmark script, but I then discovered the bad news.

      The benchmark script (such as it is):

      The benchmark script shows a little bit of a speed boost, but not what I was hoping for. I may have to try Inline::C to see what we can get. Note: The benchmark isn't useful yet, as I haven't necessarily plugged your code or jethro's in properly. If I had more time to put into it, I'd've added some of the other solutions as well. But I had to go to work, and now I've got to feed my son and do some chores.

      roboticus@Boink:~ $ perl 876421.pl
      REF: 14 21 16 1a 33 01 12 1a 33 01 22 1b 2f 11 07 08 21 01 02 1a 21 11 27 0b
      jethro: matches
      robo_2: matches
                   Rate BrowserUk    jethro    robo_2
      BrowserUk 11876/s        --       -6%      -20%
      jethro    12642/s        6%        --      -15%
      robo_2    14837/s       25%       17%        --

      Thanks for coming up with a fun diversion for this morning!

      ...roboticus


        Thanks for coming up with a fun diversion for this morning!

        'Tis fun, isn't it ;) But it will also serve a very practical purpose, which (for me) makes it all the more fun. The diversity of approaches that the monks come up with for tackling seemingly quite tightly defined problems like this always amazes me. That's why I posted it, and it's one of the very best things about this community.

        See also Re: unpacking 6-bit values for my benchmark of 4 pure perl solutions and 3 C routines, including one of yours. I would have posted it earlier, but it took a considerable amount of effort to ensure that all the routines actually produced the same output. That meant adapting almost all of the posted code to some degree or another.

        jmacnamara's solution is the hands-down winner in the pure perl stakes, being roughly twice as fast as my original. But the C versions, which are all much of a muchness despite being radically different in the way they achieve the solution, are a good 5 times faster still. Had there been a directly usable pack/unpack version, I would have expected a far smaller differential, perhaps small enough to warrant avoiding the move to C; but as it is, I think C will be necessary for this.

        Thank you (and everyone) for your time and effort, it helped me a lot.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.