jhanna has asked for the wisdom of the Perl Monks concerning the following question:

O enlightened ones: I have many 12-bit ints that I want to write to a file. I have a way that works but is miserable in its ugliness. Who has a beautiful way to pack two 12-bit ints into three chars and then read them back? Here's my ugly-bags-of-mostly-water code:
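sub c_out {
    my $code  = shift;    # 12-bit value to write
    my $final = shift;    # true on the last call, to flush a lone value
    if (defined($c_out_store)) {
        # Second value of a pair: emit both as three bytes.
        print F2 pack('H*', sprintf('%03x%03x', $code, $c_out_store));
        undef $c_out_store;
    } elsif ($final) {
        # Odd value at the end: pad the pair with a zero.
        print F2 pack('H*', sprintf('%03x%03x', 0, $code));
    } else {
        $c_out_store = $code;
    }
}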
and then to read:
while ($usedouble || !eof($file_in)) {
    if ($usedouble) {
        $new_code  = $double;
        $usedouble = 0;
    } else {
        $r = read($file_in, $double, 3);
        ($double) = unpack('N', "\000$double");
        $new_code = $double & 0xfff;
        $double >>= 12;
        $usedouble = 1;
    }
    # etc
}

Replies are listed 'Best First'.
(tye)Re: 12 bit ints pairs into 3 bytes
by tye (Sage) on Mar 01, 2001 at 03:18 UTC

    Going with a little-endian order:

    {
        my %prev;    # pending high nybble, per file handle
        sub c_out {
            my( $fh, $c )= @_;
            if( ! defined $c ) {
                # Flush: write the pending nybble, if any, as a final byte.
                print $fh pack "C", $prev{$fh}
                    if exists $prev{$fh};
            } elsif( exists $prev{$fh} ) {
                print $fh pack "CC", $prev{$fh} | (($c&0xf)<<4), $c>>4;
            } else {
                print $fh pack "C", $c & 0xff;
                $prev{$fh}= ( $c>>8 ) & 0xf;
                return;
            }
            delete $prev{$fh};
        }
    }
    {
        my %prev;    # pending low nybble of the next value, per file handle
        sub c_in {
            my( $fh )= @_;
            my $c;
            if( exists $prev{$fh} ) {
                if( ! read( $fh, $c, 1 ) ) {
                    my $nybble= delete $prev{$fh};
                    die "Invalid trailing nybble ($nybble)"
                        unless 0 == $nybble;
                    return;
                }
                # The pending nybble is the low 4 bits; this byte is the high 8.
                return delete($prev{$fh}) | unpack("C",$c)<<4;
            } else {
                my $len= read( $fh, $c, 2 );
                die "Extra trailing byte (",unpack("C",$c),")"
                    if 1 == $len;
                return if 0 == $len;
                ( $c, $prev{$fh} )= unpack "CC", $c;
                $c |= ($prev{$fh}&0xf)<<8;
                $prev{$fh} >>= 4;
                return $c;
            }
        }
    }

    I have you pass the file handle to the routine and don't handle bareword file handles so use c_out(\*FILE,$sesqui) and finish up by doing c_out(\*FILE) to flush the final nybble, if any. (Also, don't intermix *FILE and \*FILE as I didn't bother to make the code robust in the face of that either.)

            - tye (but my friends call me "Tye")
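To make the little-endian layout concrete, here is a minimal sketch that packs a single pair and prints the resulting bytes in hex. The helper name `pair_bytes` is hypothetical and not part of the code above; it only reproduces the byte arrangement that the pair-writing branch of c_out produces.

```perl
use strict;
use warnings;

# Hypothetical helper: pack one pair of 12-bit values into three bytes,
# low-order bits first.
sub pair_bytes {
    my( $x, $y )= @_;
    return pack "C3",
        $x & 0xff,                        # low 8 bits of $x
        ($x >> 8) | (($y & 0xf) << 4),    # high nybble of $x, low nybble of $y
        $y >> 4;                          # high 8 bits of $y
}

print unpack( "H*", pair_bytes( 0xABC, 0x123 ) ), "\n";    # prints bc3a12
```

So 0xABC and 0x123 end up on disk as the bytes BC 3A 12: each value's low-order bits come first.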
      This indeed looks (a) more robust (i.e., it handles several output files at once) and (b) more efficient (I hated to use sprintf). Thanks! I am still wondering, though, if it could be done in fewer lines. Somewhere between pack and vec...
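For the vec side of that thought, here is one possible sketch: treat the output string as an array of 4-bit elements and store three nybbles per value. The names `pack12` and `unpack12` are made up for illustration, and the whole buffer is built in memory rather than streamed to a file handle. vec with a width of 4 puts the lower-numbered element in the low nybble of each byte, so this happens to produce the same little-endian layout as the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: pack a list of 12-bit values into a string,
# three 4-bit vec elements per value, low nybble first.
sub pack12 {
    my $buf = '';
    my $i   = 0;
    for my $v (@_) {
        vec( $buf, $i++, 4 ) = ( $v >> $_ ) & 0xf for 0, 4, 8;
    }
    return $buf;
}

# Hypothetical helper: recover the values. The count is derived from the
# string length, so a trailing pad nybble is ignored.
sub unpack12 {
    my( $buf )= @_;
    my $count = int( length($buf) * 2 / 3 );
    return map {
        my $n = 3 * $_;
        vec( $buf, $n, 4 ) | vec( $buf, $n + 1, 4 ) << 4 | vec( $buf, $n + 2, 4 ) << 8;
    } 0 .. $count - 1;
}

my @values = ( 0xABC, 0x123, 0xFFF );
my $packed = pack12(@values);
print join( " ", map { sprintf "%03x", $_ } unpack12($packed) ), "\n";
```

It is short, but it writes one nybble at a time, so it is unlikely to beat the pack-based versions for speed.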

        Well, you can simplify things greatly by working with two items at a time. But that just forces more complexity into the caller:

        sub c2_out {
            my( $fh, $x, $y )= @_;
            if( defined $y ) {
                # Full pair: three bytes.
                print $fh pack "C3", $x & 0xff, ($x>>8) | ($y&0xf)<<4, $y>>4;
            } else {
                # Lone final value: two bytes.
                print $fh pack "CC", $x & 0xff, $x>>8;
            }
        }
        sub c2_in {
            my( $fh )= @_;
            my $c;
            my $len= read( $fh, $c, 3 );
            return if ! $len;
            die "Extra trailing byte (",unpack("C",$c),")"
                if 1 == $len;
            my( $x, $y, $z )= unpack "C*", $c;
            $x |= ($y&0xf)<<8;
            return $x if ! defined $z;
            return( $x, $z<<4 | $y>>4 );
        }

        Or, dropping the file handle support:

        {
            my @nybbles;
            sub c_out {
                push @nybbles, 0 if ! @_;
                @_= ( @nybbles, map { $_&0xf } map { $_, $_>>4, $_>>8 } @_ );
                print OUT pack "C", shift|(shift()<<4) while 1 < @_;
                @nybbles= @_;
            }
        }
        {
            my @n;
            sub c_in {
                do {
                    return shift(@n) | shift(@n)<<4 | shift(@n)<<8 if 2 < @n;
                    push @n, map { $_&0xf, $_>>4 } unpack "C*", <IN>;
                } while( 1 < @n );
                return;
            }
        }

        As usual, the emphasis seems to be on "sexy" code rather than "good" code. ):

        Update: The above code has now been tested. I've come to like the map version and would probably modify it to make it robust and flexible (which would up the number of lines, against the point of this node) if I needed this functionality.

                - tye (but my friends call me "Tye")
      Here's what I decided on:
      if (defined($o)) {
          $o = substr(pack("N", $o | ($t{$string} << 12)), 1, 3);
          print F2 $o;
          undef $o;
      } else {
          $o = $t{$string};
      }
      I thought about "L" and chop, but thought N might be more portable. And for unpack:
      $r = read(F1, $double, 3);
      ($double) = unpack('N', "\000$double");
      $old_code = $double & 0xfff;
      $double >>= 12;

        Nice. "V" is as portable as "N" and would let you use chop, BTW.

                - tye (but my friends call me "Tye")
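A minimal sketch of the "V" plus chop idea, for comparison. The names `pack_pair` and `unpack_pair` are hypothetical. "V" is always a little-endian 32-bit integer, so for a value below 2**24 the fourth byte is zero and chop can drop it; on the way back in, a zero byte is appended before unpacking.

```perl
use strict;
use warnings;

# Hypothetical helper: two 12-bit values into three bytes via "V" and chop.
sub pack_pair {
    my( $lo, $hi )= @_;
    my $bytes = pack "V", $lo | ( $hi << 12 );
    chop $bytes;    # drop the always-zero most significant byte
    return $bytes;
}

# Hypothetical helper: restore the zero byte, then split the 24 bits.
sub unpack_pair {
    my( $bytes )= @_;
    my $n = unpack "V", $bytes . "\0";
    return ( $n & 0xfff, $n >> 12 );
}

my $bytes = pack_pair( 0xABC, 0x123 );
print unpack( "H*", $bytes ), "\n";                              # prints bc3a12
printf "%03x %03x\n", unpack_pair($bytes);                       # prints abc 123
```

The bytes come out low-order first, whereas the "N" version with substr writes the same 24 bits high-order first. Files written one way cannot be read the other way.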