Good evening, monks,

I have run into an issue where automagic type conversions have become a bane rather than a boon, though I am definitely not ruling out a PEBCAK error.

I am designing a binary network protocol, and went to write a prototype implementation in my esteemed quick prototyping language, thanks to IO::Socket.

In any case, the protocol includes an error-checking routine that should, in theory, be resistant to long burst errors (errors usually come in droves). The idea is to take a squarable sequence of bits (8 bytes = an 8x8 bit matrix, 32 bytes = a 16x16 bit matrix, etc.), add a parity bit to each block, transpose the bit matrix at the source end, and transpose it back at the destination end. A burst of errors on the wire, up to a length proportional to the size of the transposed matrix, is then spread across the blocks and can be corrected automatically without resending the data.
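The burst-spreading effect described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the protocol's actual routine: the helper names transpose8 and bit_errors are hypothetical, parity is left out entirely, and the burst is simulated by flipping every bit of one byte on the wire.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: transpose an 8x8 bit matrix held in an 8-byte string.
# Row r of the output collects bit (7 - r) of every input byte.
sub transpose8 {
    my @in  = unpack 'C8', shift;
    my @out = (0) x 8;
    for my $r (0 .. 7) {
        for my $k (0 .. 7) {
            $out[$r] |= 1 << (7 - $k) if ($in[$k] >> (7 - $r)) & 1;
        }
    }
    return pack 'C8', @out;
}

# Hypothetical helper: count the bits that differ between two byte strings.
sub bit_errors {
    my ($x, $y) = @_;
    return unpack '%32b*', $x ^ $y;
}

my $sent = 'ABCDEFGH';
my $wire = transpose8($sent);

# Simulate an 8-bit burst: every bit of the third byte on the wire flips.
substr($wire, 2, 1, substr($wire, 2, 1) ^ "\xFF");

my $received = transpose8($wire);

# Compare each received byte with the byte that was sent.
my @errors = map { bit_errors(substr($sent, $_, 1), substr($received, $_, 1)) } 0 .. 7;
print "bit errors per byte: @errors\n";
```

With this sample input the 8-bit burst comes back as a single flipped bit in each of the eight received bytes, so no block carries more than one bad bit.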

If your head didn't already explode, the direct Perl issue at hand is: Perl's weak typing is turning my '00000000' into ASCII '0' (00110000), and so forth. I am aware that I can accomplish this with pack/unpack and bit strings, but that increases the memory required eight-fold and reduces throughput, which is simply not feasible for a potentially high-capacity network filter layer.
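To make the distinction concrete, here is a small sketch of how a zero byte, the ASCII digit '0', and a bit string differ. The variable names are illustrative only, and the snippet does not attempt to reproduce the exact failure described above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $zero_byte  = pack 'C', 0;   # one byte holding the value 0
my $zero_digit = '0';           # one byte holding ASCII '0' (value 48)

# ord() reads the byte value back regardless of what the byte "looks like".
printf "%08b %08b\n", ord($zero_byte), ord($zero_digit);

# A bit string costs one character per bit: eight bytes to describe one.
my $bits = unpack 'B8', 'A';
print "$bits has length ", length($bits), "\n";

# Bitwise operators follow their operands: two strings are combined byte by
# byte, two numbers are combined as integers.
my $or_str = 'AB' | '  ';       # 0x41|0x20 and 0x42|0x20, byte by byte
my $or_num = 0x41 | 0x20;       # plain integer OR
print "$or_str $or_num\n";
```

The last two lines are the part most likely to bite: the same operator gives a string result or a numeric result depending on whether both operands are strings at that point.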

Here is a visualization of the method, using a simple 8x8 bit matrix:

input:                        desired output:
scalar with this              scalar with this
bit pattern                   bit pattern:
('ABCDEFGH'):

    7 6 5 4 3 2 1 0               7 6 5 4 3 2 1 0
0   0 1 0 0 0 0 0 1           0   0 0 0 0 0 0 0 0
1   0 1 0 0 0 0 1 0           1   1 1 1 1 1 1 1 1
2   0 1 0 0 0 0 1 1           2   0 0 0 0 0 0 0 0
3   0 1 0 0 0 1 0 0           3   0 0 0 0 0 0 0 0
4   0 1 0 0 0 1 0 1           4   0 0 0 0 0 0 0 1
5   0 1 0 0 0 1 1 0           5   0 0 0 1 1 1 1 0
6   0 1 0 0 0 1 1 1           6   0 1 1 0 0 1 1 0
7   0 1 0 0 1 0 0 0           7   1 0 1 0 1 0 1 0

(Note: I am skipping the parity code for now, as this is easily done and irrelevant to the problem at hand)

This is perl5.8.1 on Linux 2.4.x. The source data file starts with ABCDEFGH, which *happens* to be ASCII, but the point is that it is in fact arbitrary binary data and must be representable and treated as such, regardless of the content of the actual bytes. I have also tried the Bit::Vector and Math::MatrixBool modules, with similar results; the type of the $scalar, should it happen to be representable in ASCII, is trampled upon.

#!/usr/bin/perl
use strict;
use warnings;

my $BUFSIZE = 8;
my $data;

# read the data file
open(FH,"chargen") or die;
binmode(FH,":raw");
read(FH,$data,$BUFSIZE);    # $data contains 'ABCDEFGH'
close(FH);

print "Source:\n";
foreach (split //, $data) {
    to_binary($_);
}

$data = transpose($data);

print "Destination:\n";
# --> cries foul in the sub for the next line
foreach (split //, $data) {
    to_binary($_);
}

sub transpose {
    my $buf = shift;
    return if (length($buf) < $BUFSIZE);
    ($buf) = substr($buf,0,$BUFSIZE) if (length($buf) > $BUFSIZE);
    my @in = unpack("C*",$buf);    # get only ascii vals
    my @out = (0,0,0,0,0,0,0,0);
    my $newrow = 0;
    for my $row (reverse 0..($BUFSIZE-1)) {
        for my $col (0..($BUFSIZE-1)) {
            $out[$newrow] |= (1 << $col) if ($in[7 - $col] >> $row & 1);
        }
        $newrow++;
    }
    return pack("C*",@out);
}

sub to_binary {
    my $byte = shift;
    ($byte) = split //, $byte if (length($byte) > 1);
    for (reverse 0..($BUFSIZE-1)) {
        my $b = ord($byte);
        print $b >> $_ & 1 ? "1" : "0";
    }
    print "\n";
}

Time was an issue, and I have already written a reference implementation in C that met with great success. However, the question still remained in the back of my mind, nagging at me like a forgotten birthday. Can perl's weakly typed scalars and automatic type conversions accomplish this task without trampling my values and having to resort to bitstrings?
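One possible direction for the question above, offered as a hedged sketch rather than a verified answer: Perl's built-in vec() addresses single bits inside a byte string, so the matrix can be transposed without expanding it into a string of '0' and '1' characters. The sub name transpose_vec is hypothetical, and the index arithmetic assumes the MSB-left layout of the visualization (vec() itself numbers bits from the low-order end of each byte).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: transpose an 8x8 bit matrix stored in an 8-byte string.
# vec($str, 8 * $byte + $bit, 1) addresses bit $bit (0 = least significant)
# of byte $byte, so "column c from the left" is bit (7 - c).
sub transpose_vec {
    my ($in) = @_;
    my $out = "\0" x 8;
    for my $r (0 .. 7) {
        for my $k (0 .. 7) {
            vec($out, 8 * $r + (7 - $k), 1) = vec($in, 8 * $k + (7 - $r), 1);
        }
    }
    return $out;
}

my $out = transpose_vec('ABCDEFGH');

# Show the transposed rows as bit patterns, then transpose back.
print join(' ', map { sprintf '%08b', $_ } unpack 'C*', $out), "\n";
print transpose_vec($out), "\n";
```

For the sample input this prints the eight rows of the desired-output matrix, followed by the original ABCDEFGH. The data stays a packed 8-byte string throughout, whatever the byte values happen to be.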

Thank you,
Jason

2004-12-15 Janitored by Arunbear - added readmore tags, as per Monastery guidelines


In reply to perl bit matrix transposition / weak typing problem by infi
