in reply to Bit order in bytes

A pointer to doc that explains what's behind b vs B decoding would be greatly appreciated.

In terms of single bytes, the difference between 'b' & 'B' templates is purely a matter of cosmetics; that is, the same information is being presented differently:

[0] Perl> print unpack 'B8', chr(65);;
01000001
[0] Perl> print unpack 'b8', chr(65);;
10000010

Nothing changed in the hardware or the internal representation of 'A', just the order in which the bits are presented to the user.

  1. 'b' produces lsb -> msb; left to right.
  2. 'B' produces msb -> lsb; left to right.
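A minimal sketch of the two rules above, using only core pack/unpack: for a single byte, the 'b8' string is the 'B8' string read backwards. The variable names are illustrative only.

```perl
use strict;
use warnings;

my $byte = chr(65);                    # 'A', 0x41

my $msb_first = unpack 'B8', $byte;    # most significant bit first
my $lsb_first = unpack 'b8', $byte;    # least significant bit first

print "B8: $msb_first\n";              # 01000001
print "b8: $lsb_first\n";              # 10000010

# Same eight bits, opposite display order.
my $mirrored = scalar reverse $msb_first;
print "mirror images\n" if $mirrored eq $lsb_first;
```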

Where the real difference comes in is when dealing with values greater than one byte:

## spacing and annotation added manually...
[0] Perl> print unpack 'b*', "\x12\x34";;
0100 1000 0010 1100
   2    1    4    3
[0] Perl> print unpack 'B*', "\x12\x34";;
0001 0010 0011 0100
   1    2    3    4

As you can see, not only is the bit-order different, but so is the apparent ordering of the nybbles within the bytes. In part this is due to my using little-endian hardware. If you are using or have access to a big-endian machine, you'd see different results above, but they would still both be just different ways of viewing the same information.

Again nothing changed in the storage of the values within the memory, the apparent reordering is purely down to the way the bits are displayed.
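To make the byte boundaries explicit, here is a small sketch that unpacks the same two-byte string one byte at a time. It assumes nothing beyond core Perl; the variable names are made up for the example.

```perl
use strict;
use warnings;

my $str = "\x12\x34";

# Unpack one byte at a time so the byte boundaries are explicit.
my @msb_first = map { unpack('B8', $_) } split //, $str;
my @lsb_first = map { unpack('b8', $_) } split //, $str;

print "B: @msb_first\n";   # 00010010 00110100
print "b: @lsb_first\n";   # 01001000 00101100

# Byte order is identical; only the bits inside each byte are mirrored.
for my $i (0 .. $#msb_first) {
    my $mirrored = scalar reverse $msb_first[$i];
    print "byte $i mirrored\n" if $mirrored eq $lsb_first[$i];
}
```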

So, the difference is just an illusion created by the way you are viewing the bits, and probably not what you should be concentrating on.

Provided you are calculating the correct bit positions for your use of vec in _set_header_field() -- which is a matter of whether you've done your homework correctly -- how you choose to view those bits (i.e. with 'b' or 'B') is really down to which makes more sense for you.
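As a rough illustration of how vec numbers bits (I haven't seen _set_header_field(), so this is only a sketch of core vec behaviour, not of that routine): with a width of 1, bit offset N in vec lines up with character N of the 'b*' string.

```perl
use strict;
use warnings;

my $buf = "\0\0";            # two zero bytes, 16 bits

vec($buf, 0, 1) = 1;         # bit 0: least significant bit of byte 0
vec($buf, 9, 1) = 1;         # bit 9: second-lowest bit of byte 1

my $as_b = unpack 'b*', $buf;
my $as_B = unpack 'B*', $buf;

print "b*: $as_b\n";         # 1000000001000000
print "B*: $as_B\n";         # 0000000100000010
```

Viewed with 'b*', the set bits sit at string positions 0 and 9, the same numbers that were passed to vec; viewed with 'B*', the same bytes need a little arithmetic to locate.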


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^2: Bit order in bytes
by syphilis (Archbishop) on Dec 10, 2013 at 12:08 UTC
    If you are using or have access to a big-endian machine, you'd see different results above

    Actually, it's the same results - these templates apparently know whether they're on a big-endian or little-endian machine, and adjust themselves accordingly to standardise the output.

    Not what I was expecting ... I have, however, just tested this.

    Cheers,
    Rob
      Actually, it's the same results - these templates apparently know whether they're on a big-endian or little-endian machine, and adjust themselves accordingly to standardise the output.

      Hm. That is a surprise.

      Now I am really confused by the apparent nybble swapping:

      print unpack 'b*', "\x12\x34";;
      0100 1000 0010 1100
         2    1    4    3
      print unpack 'B*', "\x12\x34";;
      0001 0010 0011 0100
         1    2    3    4
      print unpack 'b*', pack 'v', 0x1234;;
      0010 1100 0100 1000
         4    3    2    1
      print unpack 'b*', pack 'n', 0x1234;;
      0100 1000 0010 1100
         2    1    4    3

      I can't make sense of that at all.


        Your pack 'v' and pack 'n' arrange the bytes in opposite orders ('v' gives \x34\x12, 'n' gives \x12\x34) - and unpack 'b*' just unpacks them both from the same end ... hence the bytes come out in the reverse order - but each byte is read in ascending bit order, in accordance with the 'b' template spec.

        Where you're unpacking the \x12\x34, that's also correct. Both 'b*' and 'B*' read the bytes in the same order, but read the bits of each byte from opposite ends.
        I don't think there's any nybble-swapping. You've just got one byte that's read either as 01001000 or (reversed) 00010010 (depending upon the template) - and another byte that's being read as either 00101100 or (reversed) 00110100 (depending upon the template).
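A quick sketch to check that reading: dump the packed bytes as hex first, then as 'b*' bits. It uses only core pack/unpack; the variable names are illustrative.

```perl
use strict;
use warnings;

my $le = pack 'v', 0x1234;   # 16-bit little-endian: bytes 0x34 0x12
my $be = pack 'n', 0x1234;   # 16-bit big-endian:    bytes 0x12 0x34

printf "v: %s\n", unpack 'H*', $le;   # 3412
printf "n: %s\n", unpack 'H*', $be;   # 1234

# 'b*' walks the bytes in string order, each one lsb first.
my $le_bits = unpack 'b*', $le;
my $be_bits = unpack 'b*', $be;

print "v: $le_bits\n";   # 0010110001001000
print "n: $be_bits\n";   # 0100100000101100
```

The two bit strings hold the same pair of 8-character groups in swapped order, which is the byte swap done by pack, not anything done by 'b*'.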

        ... I think ...

        Cheers,
        Rob

        Not sure about your statement on swapping, but I cannot see any: hex 12 is decimal 18 = 16 + 2, which is binary 00010010, and hex 34 is decimal 52, which is binary 00110100.

        print unpack 'b*', "\x12\x34";;
        01001000 00101100
              18       52
        print unpack 'B*', "\x12\x34";;
        00010010 00110100
              18       52
      It's when you're working with floating point that you'd have to deal with the headache you're anticipating. Int formats are considerably more agreeable to deal with across different endian machines.
Re^2: Bit order in bytes
by geoffleach (Scribe) on Dec 10, 2013 at 18:50 UTC
    I'll buy the illusion. Is it the case, then, that 'B' decoding is looking at the bytes and presenting them in a big-endian convention? (We're little-endian here.)

    Hopefully this won't confuse things further. Here's the sequence of events in the application.

    o Perl code packs bits into int, numbered 31 to 0.

    o Int is passed to xs code that passes it on to C++ lib

    o C++ coerces the int to unsigned long

    o The long is converted to std::bitset

    o Indexing the bitset, the byte order is correct,

    o The bits are reversed in each byte
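If the symptom really is "right bytes, mirrored bits", one way to confirm it (or to compensate on the Perl side) is to flip the bits inside every byte and compare. This is only a sketch: the sub name reverse_bits_per_byte is hypothetical, and it says nothing about what the XS or std::bitset code is actually doing.

```perl
use strict;
use warnings;

# Mirror the bit order inside each byte, leaving byte order alone.
# Reading msb-first with 'B*' and writing lsb-first with 'b*' does the flip.
sub reverse_bits_per_byte {
    my ($bytes) = @_;
    return pack 'b*', unpack 'B*', $bytes;
}

my $in  = "\x12\x34";
my $out = reverse_bits_per_byte($in);

printf "%s -> %s\n", unpack('H*', $in), unpack('H*', $out);   # 1234 -> 482c
```

Applying the sub twice returns the original string, so it can be used as a round-trip check.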

      Maybe this will clarify things a little.

      This constructs a union between an unsigned 32-bit integer and a struct containing 32 x 1-bit fields.

      It assigns the value 0x01234567 to the uint and then prints (from C) a string of 0s & 1s to reflect the bitfields, first to last in the struct.

      You'll notice from the output (after __END__), that the first bitfield in the struct maps to the lsb in the integer; and the last to the msb; indicating that this is a little-endian (Intel) machine.

      It then passes the uint back to perl, packs it using the little-endian template 'V' and unpacks its bits using both 'b' and 'B'.

      Note that the 'b' template mirrors the ordering of the bits as seen via the bitfields.
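The Perl half of that comparison can be tried without a C compiler. A minimal sketch, assuming only core pack/unpack and the same test value; the variable names are illustrative.

```perl
use strict;
use warnings;

my $uint  = 0x01234567;
my $bytes = pack 'V', $uint;             # 32-bit little-endian: 67 45 23 01

my $hex       = unpack 'H*', $bytes;
my $lsb_first = unpack 'b*', $bytes;     # bit 0 of the integer comes first
my $msb_first = unpack 'B*', $bytes;

print "$hex\n";         # 67452301
print "$lsb_first\n";   # 11100110101000101100010010000000
print "$msb_first\n";   # 01100111010001010010001100000001

# With 'V' + 'b*', character N of the string is bit N of the integer.
my $bit5 = substr $lsb_first, 5, 1;
print "bit 5 is $bit5\n" if $bit5 == (($uint >> 5) & 1);
```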

      However, once you go beyond that into the realms of C++ coercions and bitsets, I'm afraid you're on your own.

      #! perl -slw
      use strict;
      use Inline C => Config => BUILD_NOISY => 1,;
      use Inline C => <<'END_C', NAME => 'bitfields', CLEAN_AFTER_BUILD => 0;

      #include "mytypes.h"

      /* 32 one-bit fields overlaid on a single unsigned 32-bit integer */
      union {
          struct {
              unsigned b00:1; unsigned b01:1; unsigned b02:1; unsigned b03:1;
              unsigned b04:1; unsigned b05:1; unsigned b06:1; unsigned b07:1;
              unsigned b08:1; unsigned b09:1; unsigned b10:1; unsigned b11:1;
              unsigned b12:1; unsigned b13:1; unsigned b14:1; unsigned b15:1;
              unsigned b16:1; unsigned b17:1; unsigned b18:1; unsigned b19:1;
              unsigned b20:1; unsigned b21:1; unsigned b22:1; unsigned b23:1;
              unsigned b24:1; unsigned b25:1; unsigned b26:1; unsigned b27:1;
              unsigned b28:1; unsigned b29:1; unsigned b30:1; unsigned b31:1;
          } bits;
          U32 uint;
      } X;

      U32 test( SV *unused ) {
          X.uint = 0x01234567;
          printf( "%u\n", X.uint );
          /* print the bitfields, first to last in the struct */
          printf( "%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c%c\n",
              X.bits.b00 ? '1' : '0', X.bits.b01 ? '1' : '0', X.bits.b02 ? '1' : '0', X.bits.b03 ? '1' : '0',
              X.bits.b04 ? '1' : '0', X.bits.b05 ? '1' : '0', X.bits.b06 ? '1' : '0', X.bits.b07 ? '1' : '0',
              X.bits.b08 ? '1' : '0', X.bits.b09 ? '1' : '0', X.bits.b10 ? '1' : '0', X.bits.b11 ? '1' : '0',
              X.bits.b12 ? '1' : '0', X.bits.b13 ? '1' : '0', X.bits.b14 ? '1' : '0', X.bits.b15 ? '1' : '0',
              X.bits.b16 ? '1' : '0', X.bits.b17 ? '1' : '0', X.bits.b18 ? '1' : '0', X.bits.b19 ? '1' : '0',
              X.bits.b20 ? '1' : '0', X.bits.b21 ? '1' : '0', X.bits.b22 ? '1' : '0', X.bits.b23 ? '1' : '0',
              X.bits.b24 ? '1' : '0', X.bits.b25 ? '1' : '0', X.bits.b26 ? '1' : '0', X.bits.b27 ? '1' : '0',
              X.bits.b28 ? '1' : '0', X.bits.b29 ? '1' : '0', X.bits.b30 ? '1' : '0', X.bits.b31 ? '1' : '0'
          );
          return X.uint;
      }
      END_C

      my $uint = test( 1 );
      print unpack 'b*', pack 'V', $uint;
      print unpack 'B*', pack 'V', $uint;

      __END__
      C:\test>bitFields.pl
      19088743
      11100110101000101100010010000000
      11100110101000101100010010000000
      01100111010001010010001100000001
