infi has asked for the wisdom of the Perl Monks concerning the following question:

Good evening, monks,

I have run into an issue where automagic type conversions have become a bane rather than a boon.. though I am definitely not ruling out a PEBCAK error.

I am designing a binary network protocol, and went to write a prototype implementation in my esteemed quick prototyping language, thanks to IO::Socket.

In any case, the protocol includes an error checking routine, that should theoretically be resistant to long burst errors (as errors usually come in droves). The theory is that by taking a squarable sequence of bits (8 bytes = 8x8 bit matrix, 32 bytes = 16x16 bit matrix, etc.), a parity bit can be added to each block, and then the bit matrix can be transposed at the source end, and retransposed at the destination end. This will create a situation where bursts of errors, proportional to the size of the transposed matrix, can be automatically corrected without resending the data.
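To make the burst-spreading idea concrete, here is a minimal sketch for the 8x8 case. It assumes a plain bit transposition built on vec, leaves the parity bits out entirely, and uses made-up helper names (transpose_bits, flipped_bits_per_byte); the orientation of the transposition is also an assumption and need not match the real protocol. A burst that inverts one whole byte on the wire shows up as a single flipped bit in each of the eight bytes once the block is transposed back.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Transpose an 8-byte string viewed as an 8x8 bit matrix.
# Bit c of byte r moves to bit r of byte c (vec numbers bits from the
# low end of each byte), so applying the sub twice restores the input.
sub transpose_bits {
    my ($in) = @_;
    my $out = "\0" x 8;
    for my $r (0 .. 7) {
        for my $c (0 .. 7) {
            vec($out, 8 * $c + $r, 1) = vec($in, 8 * $r + $c, 1);
        }
    }
    return $out;
}

# Count, per byte, how many bits differ between two equal-length strings.
sub flipped_bits_per_byte {
    my ($x, $y) = @_;
    my $diff = "$x" ^ "$y";    # quoting forces the string form of xor
    return map { unpack "%32b*", $_ } split //, $diff;
}

my $data = "ABCDEFGH";
my $wire = transpose_bits($data);

# Simulate an 8-bit burst: invert every bit of the fourth byte on the wire.
vec($wire, 3, 8) ^= 0xFF;

my $received = transpose_bits($wire);
print join(" ", flipped_bits_per_byte($data, $received)), "\n";
```

Run as is, this prints eight 1s: every received byte differs from the original in exactly one bit, which is the kind of damage a per-byte parity check can at least detect.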

If your head didn't already explode, the direct perl issue at hand is: perl's weak typing is turning my '00000000' into ascii '0' (00110000), and so forth. I am aware that I can accomplish this with pack/unpack, but this has the side-effect of increasing the memory required eight-fold, as well as reducing throughput speed, which, in the position of a potentially high-capacity network filter layer, simply is not feasible.
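For scale, a small sketch of the eight-fold expansion mentioned above. It assumes eight bytes of sample data (the string 'ABCDEFGH' is only an example) and uses unpack with the B* template, which produces one '0' or '1' character per bit.

```perl
use strict;
use warnings;

my $data = "ABCDEFGH";            # 8 bytes of raw data
my $bits = unpack "B*", $data;    # one ASCII character per bit

printf "raw: %d bytes, bitstring: %d bytes\n", length($data), length($bits);
print substr($bits, 0, 8), "\n";  # bits of the first byte, high bit first
```

The round trip through pack "B*" restores the original bytes, so the cost is purely the temporary 64-character string and the two conversions.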

Here is a visualization of the method, using a simple 8x8 bit matrix:

input:                          desired output:
scalar with this                scalar with this
bit pattern ('ABCDEFGH'):       bit pattern:

   7 6 5 4 3 2 1 0                 7 6 5 4 3 2 1 0
0  0 1 0 0 0 0 0 1              0  0 0 0 0 0 0 0 0
1  0 1 0 0 0 0 1 0              1  1 1 1 1 1 1 1 1
2  0 1 0 0 0 0 1 1              2  0 0 0 0 0 0 0 0
3  0 1 0 0 0 1 0 0              3  0 0 0 0 0 0 0 0
4  0 1 0 0 0 1 0 1              4  0 0 0 0 0 0 0 1
5  0 1 0 0 0 1 1 0              5  0 0 0 1 1 1 1 0
6  0 1 0 0 0 1 1 1              6  0 1 1 0 0 1 1 0
7  0 1 0 0 1 0 0 0              7  1 0 1 0 1 0 1 0

(Note: I am skipping the parity code for now, as this is easily done and irrelevant to the problem at hand)

This is perl5.8.1 on Linux 2.4.x. The source data file starts with ABCDEFGH, which *happen* to be ascii, but the point is that it is in fact arbitrary binary data and must be representable and treated as such, regardless of the content of the actual bytes. I have also tried the Bit::Vector and Math::MatrixBool modules, with similar results; the type of the $scalar, should it happen to be representable in ASCII, is trampled upon.

#!/usr/bin/perl
use strict;
use warnings;

my $BUFSIZE = 8;
my $data;

# read the data file
open(FH,"chargen") or die;
binmode(FH,":raw");
read(FH,$data,$BUFSIZE);    # $data contains 'ABCDEFGH'
close(FH);

print "Source:\n";
foreach (split //, $data) { to_binary($_); }

$data = transpose($data);

print "Destination:\n";
# --> cries foul in the sub for the next line
foreach (split //, $data) { to_binary($_); }

sub transpose {
    my $buf = shift;
    return if (length($buf) < $BUFSIZE);
    ($buf) = substr($buf,0,$BUFSIZE) if (length($buf) > $BUFSIZE);
    my @in = unpack("C*",$buf);    # get only ascii vals
    my @out = (0,0,0,0,0,0,0,0);
    my $newrow = 0;
    for my $row (reverse 0..($BUFSIZE-1)) {
        for my $col (0..($BUFSIZE-1)) {
            $out[$newrow] |= (1 << $col) if ($in[7 - $col] >> $row & 1);
        }
        $newrow++;
    }
    return pack("C*",@out);
}

sub to_binary {
    my $byte = shift;
    ($byte) = split //, $byte if (length($byte) > 1);
    for (reverse 0..($BUFSIZE-1)) {
        my $b = ord($byte);
        print $b >> $_ & 1 ? "1" : "0";
    }
    print "\n";
}

Time was an issue, and I have already written a reference implementation in C that met with great success. However, the question still remained in the back of my mind, nagging at me like a forgotten birthday. Can perl's weakly typed scalars and automatic type conversions accomplish this task without trampling my values and having to resort to bitstrings?

Thank you,
Jason

2004-12-15 Janitored by Arunbear - added readmore tags, as per Monastery guidelines

Replies are listed 'Best First'.
Re: perl bit matrix transposition / weak typing problem
by BrowserUk (Patriarch) on Dec 13, 2004 at 12:21 UTC

    I think that you are going about this in too C-ish a manner. Whenever I encounter a problem where Perl seems to be getting in the way, I tend to find that I'm using the wrong method, and that by looking around there is a perlish method or function that addresses the problem.

    In this case, you need vec.

    #! perl -slw
    use strict;

    sub xformBits {
        my( $in ) = shift;
        my $out = chr(0) x length( $in );
        vec($out,$_,1) = vec($in,7-int($_/8)+8*(7-$_%8),1) for 0..63;
        return $out;
    }

    my $example = 'ABCDEFGH';
    print "Input: $example";
    print join ' ', split'', unpack 'B8', $_ for split '', $example;

    my $xformed = xformBits $example;
    print "\nOutput: $xformed";
    print join ' ', split'', unpack 'B8', $_ for split '', $xformed;

    __END__
    [12:12:24.25] P:\test>414344
    Input: ABCDEFGH
    0 1 0 0 0 0 0 1
    0 1 0 0 0 0 1 0
    0 1 0 0 0 0 1 1
    0 1 0 0 0 1 0 0
    0 1 0 0 0 1 0 1
    0 1 0 0 0 1 1 0
    0 1 0 0 0 1 1 1
    0 1 0 0 1 0 0 0

    Output:   ??f¬
    0 0 0 0 0 0 0 0
    1 1 1 1 1 1 1 1
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 0
    0 0 0 0 0 0 0 1
    0 0 0 1 1 1 1 0
    0 1 1 0 0 1 1 0
    1 0 1 0 1 0 1 0
    [12:12:49.51] P:\test>
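A small sketch of why the index arithmetic above works, relying only on documented vec behaviour. With a width of 1, vec numbers bits from the low-order end of each byte, so the expression 7-int($_/8)+8*(7-$_%8) takes bit (7 - r) of byte (7 - c) as the source for bit c of byte r. One consequence, checked below, is that the mapping undoes itself, so the same routine could plausibly serve at both ends. The sub is repeated only so that the snippet runs on its own.

```perl
use strict;
use warnings;

# Same mapping as xformBits above, repeated so this snippet is self-contained.
sub xformBits {
    my ($in) = @_;
    my $out = chr(0) x length($in);
    vec($out, $_, 1) = vec($in, 7 - int($_ / 8) + 8 * (7 - $_ % 8), 1) for 0 .. 63;
    return $out;
}

my $example = 'ABCDEFGH';

# vec counts bits from the low end of each byte: 'A' is 0x41 = 01000001.
print "bit 0 of 'A': ", vec($example, 0, 1), "\n";    # low-order bit
print "bit 6 of 'A': ", vec($example, 6, 1), "\n";
print "bit 7 of 'A': ", vec($example, 7, 1), "\n";    # high-order bit

# The data stays an 8-byte string throughout; no bitstring is ever built.
my $once  = xformBits($example);
my $twice = xformBits($once);
print "length after transform: ", length($once), "\n";
print "round trip ok\n" if $twice eq $example;
```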

    Examine what is said, not who speaks.        The end of an era!
    "But you should never overestimate the ingenuity of the sceptics to come up with a counter-argument." -Myles Allen
    "Think for yourself!" - Abigail        "Time is a poor substitute for thought"--theorbtwo         "Efficiency is intelligent laziness." -David Dunham
    "Memory, processor, disk in that order on the hardware side. Algorithm, algorithm, algorithm on the code side." - tachyon

      Hooray! Yes, vec() was the function I was looking for, indeed. This will avoid the bitstring solution altogether, something which nearly everyone has recommended to me up to this point.

      Thank you, BrowserUk, for restoring my faith and showing me the light. I was beginning to wonder if I would have to wait until perl6 (and hope that it included stronger typing in this fashion..) I'll work on the C-ish thinking, as well.

      Off to check perldelta to see when vec() arrived...

      Cheers,
      -Jason

Re: perl bit matrix transposition / weak typing problem
by ysth (Canon) on Dec 13, 2004 at 10:59 UTC
    Umm, transpose() doesn't return anything. Adding a return pack("C*",@out) makes it do what you want. If it doesn't for you, could you say what perl version you are using? And exactly what you are seeing? "Cries foul" isn't descriptive enough.

    Update: another way:

    sub transpose {
        return if length $_[0] < $BUFSIZE;
        pack "B*", join "",
            ( split //, unpack "B*", substr $_[0], 0, $BUFSIZE )
            [ map $_ % $BUFSIZE * $BUFSIZE + int( $_ / $BUFSIZE ), 0 .. $BUFSIZE**2 - 1 ]
    }
    Just as background, in case it helps with your problem, note that perl's bitops have two different modes, numeric (where the scalar operands are numbers) and bitwise, where the operands are strings and the operation is done on each bit of each character. The numeric mode is triggered whenever either of the operands was set to a number or has been used in a numeric context. For example, 3|8 is 11, but "3"|"8" is ";". You can force one or the other regardless of where a scalar's been by replacing one of the operands, say $x, with (0+$x) to force numeric mode, or stringize both, so $x^$y becomes "$x"^"$y", to force bitwise mode. See perlop for more info.
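A short sketch of the two modes described above, using only core operators. The variable names are illustrative, and the snippet assumes perl's default behaviour, without the optional 'bitwise' feature that newer perls can enable.

```perl
use strict;
use warnings;

# Numeric operands: ordinary integer OR.
print 3 | 8, "\n";              # 11

# String operands: OR is applied character by character
# ("3" is 0x33, "8" is 0x38, and 0x33 | 0x38 is 0x3B, which is ";").
print "3" | "8", "\n";          # ;

my $x = "3";
my $y = "8";
print( (0 + $x) | $y, "\n" );   # one numeric operand forces numeric mode: 11
print "$x" | "$y", "\n";        # both stringified forces string mode: ;
```

The last line still takes the string path even though $x and $y have been used numerically by then, because the quotes build fresh strings.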

      Yes, my mistake on the transpose function with no return value. I have updated it correctly. I had trimmed it down to avoid warlording and posting a million lines of irrelevant code, and inadvertently chopped the return line as well :)

      Anyway, the pack/unpack method above converts to a bitstring, then back again, does it not? Using a sequence of ascii '1's and '0's was precisely what I was trying to avoid... and I *know* perl can accomplish this task in that way. But, that method increases the memory requirements of the implementation eight-fold, adds internal conversion overhead, and doesn't fall into the design requirements of the original question :>

      I hope this isn't misconstrued, but remember, this function has to transpose *every* $BUFSIZE bytes, and quickly, with low memory requirements, to be able to sit at the position in the protocol stack that it does.

      Thanks,
      Jason

        I'm not sure what you are saying; does your original example work for you but is not fast enough?

        I see you have return $out now, but I don't see you setting $out.

        Update: I reread your original post and now see where you talk about trying to avoid the very kind of approach you present there. But I don't see where the trampling you describe comes into it. Do see BrowserUk's suggestion of using vec, which is perl's builtin way of dealing with data by bits without conversion.

Re: perl bit matrix transposition / weak typing problem
by bart (Canon) on Dec 13, 2004 at 11:26 UTC
    I have already written a reference implementation in C that met with great success.

    Then why not use Inline::C to include it in your module?

Re: perl bit matrix transposition / weak typing problem
by sasikumar (Monk) on Dec 13, 2004 at 11:19 UTC
    Hi,
    The transpose function returns nothing.
    What are you trying to do in transpose?

    The unpack should have a matching pack statement to
    accomplish your task. Just add
    pack("C*",@out)
    at the end of your transpose function.
    However, the question still remained in the back of my mind,
    nagging at me like a forgotten birthday. Can perl's
    weakly typed scalars and automatic type conversions accomplish this task without trampling
    my values and having to resort to bitstrings?

    Perl offers you complete flexibility. Let's make sure we don't misuse it and then blame Perl for the result.

    Thanks
    Sasi Kumar