"be consistent" PerlMonks

### Extracting Bit Fields (Again)

by ozboomer (Friar)
on Oct 12, 2005 at 03:35 UTC

ozboomer has asked for the wisdom of the Perl Monks concerning the following question:

I still can't get my head around this bit manipulation very well at the moment. Maybe it's just that I'm full of 'flu and I'm not seein' the obvious... Still, the current problem:

An 8-bit byte that has 3 bit-fields in it, thus:

```
bbbbbbbb
||  |
||  +-- D3 (4 bits)
|+----- D2 (3 bits)
+------ D1 (1 bit)
```
As an example, the value 162 decimal (10100010 binary) would give:

```
D1 = 1, D2 = 2, D3 = 2
```
...but try as I might, I just can't get the thing out. unpack('b1b3b4', $value) doesn't work, and using vec() doesn't help either.

I can logically "&" the value and get the bits selected out but I'd then need to do a 'rotate right' or something to get them into usable numbers.

Any pointers on how to get 'round this?

Replies are listed 'Best First'.
Re: Extracting Bit Fields (Again)
by ikegami (Patriarch) on Oct 12, 2005 at 03:53 UTC

There is no rotate operator readily available in Perl, but Perl does have shift operators (<< and >>), and that's all you need:

```
$D1 = ($num >> 7) & 0x01;  # 1 bit  starting at bit 7.
$D2 = ($num >> 4) & 0x07;  # 3 bits starting at bit 4.
$D3 = ($num >> 0) & 0x0F;  # 4 bits starting at bit 0.
```

The above is much easier than using unpack because unpack('b', ...) returns a string representation of the number. You can work around it as follows (for fields no bigger than 8 bits):

```
@D = map { unpack('C', pack('b*', $_)) } unpack('b1b3b4', 162);
```

This is the approach I’d take, as well.

Now if you go one step further, you can reorder and rewrite this like so:

```
$D3 = ( $num >> 0 ) & ( 1 << 4 ) - 1;  # 4 bits starting at bit 0
$D2 = ( $num >> 4 ) & ( 1 << 3 ) - 1;  # 3 bits starting at bit 4
$D1 = ( $num >> 7 ) & ( 1 << 1 ) - 1;  # 1 bit  starting at bit 7
```

And then destructively rewrite the input:

```
$D3 = $num & ( 1 << 4 ) - 1;  # 4 bits starting at bit 0
$num = $num >> 4;

$D2 = $num & ( 1 << 3 ) - 1;  # 3 bits starting at bit 4
$num = $num >> 3;

$D1 = $num & ( 1 << 1 ) - 1;  # 1 bit  starting at bit 7
$num = $num >> 1;
```

Hmm…

```
@width = ( 4, 3, 1 );

$D3 = $num & ( 1 << $width[ 0 ] ) - 1;  # 4 bits starting at bit 0
$num = $num >> $width[ 0 ];

$D2 = $num & ( 1 << $width[ 1 ] ) - 1;  # 3 bits starting at bit 4
$num = $num >> $width[ 1 ];

$D1 = $num & ( 1 << $width[ 2 ] ) - 1;  # 1 bit  starting at bit 7
$num = $num >> $width[ 2 ];
```

So obviously:

```
sub bitfield {
    my ( $num, @field ) = @_;
    my @value;
    for my $width ( @field ) {
        push @value, $num & ( 1 << $width ) - 1;
        $num = $num >> $width;
    }
    return @value;
}
```

Unfortunately, this works only for bitfields up to the architecture integer size. It is possible with some contortions to cut fields from a string longer than 32 or 64 bits if you pluck the right pieces out manually, but no individual bitfield can ever be longer than 32/64 bits.
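For anyone skimming, here is a minimal sketch of how the bitfield() sub above might be called on the original example. The sub is repeated only so the snippet runs on its own. The widths are listed from the least significant field upward, so the values come back in the order D3, D2, D1.

```perl
use strict;
use warnings;

# Same logic as the bitfield() sub above, repeated so this runs standalone.
sub bitfield {
    my ( $num, @field ) = @_;
    my @value;
    for my $width (@field) {
        push @value, $num & ( 1 << $width ) - 1;   # mask off the low $width bits
        $num >>= $width;                           # then discard them
    }
    return @value;
}

# Widths are given lowest field first: D3 (4 bits), D2 (3 bits), D1 (1 bit).
my ( $d3, $d2, $d1 ) = bitfield( 162, 4, 3, 1 );
print "D1 = $d1, D2 = $d2, D3 = $d3\n";    # D1 = 1, D2 = 2, D3 = 2
```

Note that the caller has to remember the reversed order; wrapping the call in `reverse` is one way around that.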

Makeshifts last the longest.

There is a rotate... I'd just forgotten about it... and this way is the simplest and clearest for what I need to do:-
```
$D1 = ($intbuf & 0x80) ? 1 : 0;
$D2 = ($intbuf >> 4) & 0x07;
$D3 = ($intbuf & 0x0F);
```
My fogged brain can now keep struggling away at this job...

Many, many thanks for all your help, folks.

There is a rotate...
Huh? No. Or are you talking about some module?
Re: Extracting Bit Fields (Again)
by BrowserUk (Patriarch) on Oct 12, 2005 at 04:26 UTC

Here is one way to do it (but you aren't going to like it!):

```
print unpack 'C*', pack '(B8)*', map{
    substr '00000000'.$_, -8
} unpack 'A1A3A4',unpack 'B*',pack 'C', 162;;

1 2 2
```

To explain that, (right-to-left):

```
## pack 162 into an 8-bit binary value (a char).
pack 'C', 162

## unpack that into its bits (asciized binary bitstring).
unpack 'B*',

## split that into the 3 fields ('1', '010', '0010').
unpack 'A1A3A4',

## Pad each to the smallest size (8 bits) that Perl can deal with numerically.
map{ substr '00000000'.$_, -8 }

## Pack them back up to binary values (chars).
pack '(B8)*',

## And unpack them back to numeric values.
unpack 'C*',

## and convert those to ascii-decimal for display
print
```

The problem with vec is that it only allows you to deal with bits in quantities that are powers of 2, and on power-of-2 boundaries, which makes dealing with your 3-bit field problematic.
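To make that limitation concrete, here is a small sketch (not from the original post) of how far vec gets with the example byte. It handles the 1-bit and 4-bit fields directly, but the 3-bit field has to be masked out of the high nibble by hand. The variable names are made up for illustration.

```perl
use strict;
use warnings;

my $byte = pack 'C', 162;      # vec operates on a string, not on a number

my $d1 = vec $byte, 7, 1;      # single bit 7 (the 0x80 bit)
my $d3 = vec $byte, 0, 4;      # element 0 at width 4 is the low nibble
my $hi = vec $byte, 1, 4;      # element 1 is the high nibble: D1 and D2 together
my $d2 = $hi & 0x07;           # no 3-bit width in vec, so mask manually

print "$d1 $d2 $d3\n";         # 1 2 2
```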

Probably easier to use is a subroutine like this:

```
#! perl -slw
use strict;

sub bitField {
    my( $value, $offset, $size ) = @_;
    my $mask = ( ( 1<<$size) - 1 ) << $offset;
    return ( $value & $mask ) >> $offset;
}

print bitField 162, 7, 1;
print bitField 162, 4, 3;
print bitField 162, 0, 4;

__END__
P:\test>499354
1
2
2
```

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
I really liked this code snippet and the detailed explanation that you have provided. I have a requirement to parse a 32-bit register with bit fields and produce human-readable values for each field. For example, I have a register definition like:

LinkStatus: 2 # 2 bits for this field
CardStatus: 3
Reserved: 7
IntrStatus: 15
etc.

Given a hex value, I would like to output something like:

LinkStatus = 3
CardStatus = 0
Reserved = 0
IntrStatus = 200
etc.

I tried to use the above method, but I guess the field sizes are limited to 8 bits. I'm wondering whether we can use pack/unpack for this, or should I use the more classical bit extraction operators (<< and >>)? I'd appreciate pointers on this. Thanks.

Maybe something like this would work for you? (Update: simplified code):

Update 2: Unsimplified code to correct errors noted by oyster2011 below (Thanks!):

```
#! perl -slw
use strict;

sub bitFields{
    my( $val, $pat ) = @_;
    my @fields = unpack $pat, reverse unpack 'b32', pack 'L', $val;
    return unpack 'L*', pack '(b32)*', map scalar( reverse), @fields;
}

my( $linkStatus, $cardStatus, $reserved, $intrStatus ) =
    bitFields( 123456789, 'x2 a2 a3 a7 a15' );

print for $linkStatus, $cardStatus, $reserved, $intrStatus;

__END__
C:\test>bitFields.pl
0
3
86
31138
```

Use 'xn' in the template to skip over bits you aren't interested in. You might need to use 'B32' instead of 'b32' depending upon your platform.

Masking and shifting is almost certainly quicker, but this could be seen as a nice abstraction.
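For comparison, a shift-and-mask version of the same job might look like the sketch below. The helper name named_fields and the [name, width] layout format are invented for this example. It assumes a 32-bit register with fields listed from the most significant bit down, which matches the template used above.

```perl
use strict;
use warnings;

# Hypothetical helper: walk a 32-bit value from the top bit downward,
# peeling off one field per [name, width] pair.
sub named_fields {
    my ( $val, @layout ) = @_;
    my $pos = 32;                      # bits not yet consumed
    my @out;
    for my $field (@layout) {
        my ( $name, $width ) = @$field;
        $pos -= $width;                # this field now starts at bit $pos
        push @out, [ $name, ( $val >> $pos ) & ( ( 1 << $width ) - 1 ) ];
    }
    return @out;
}

my @fields = named_fields(
    123456789,
    [ skip       => 2 ],
    [ LinkStatus => 2 ],
    [ CardStatus => 3 ],
    [ Reserved   => 7 ],
    [ IntrStatus => 15 ],
);

# Prints LinkStatus = 0, CardStatus = 3, Reserved = 86, IntrStatus = 31138
print "$_->[0] = $_->[1]\n" for grep { $_->[0] ne 'skip' } @fields;
```

Keeping the names alongside the widths means the report the poster asked for falls out of the same loop.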

You could also do away with the intermediates if that floats your boat:

```
sub bitFields{
    unpack 'L*', pack'(b32)*', map scalar( reverse),
        unpack $_[1], reverse unpack 'b32', pack 'L', $_[0];
}
```

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Re: Extracting Bit Fields (Again)
by GrandFather (Saint) on Oct 12, 2005 at 03:42 UTC
```
use warnings;
use strict;

my $value = 162;
my ($b1, $b2, $b3) = unpack 'b1b3b4', $value;
print "$b1 $b2 $b3";
```

Prints 1 011 0100 which may not be what you expected. What would you like to see?

Perl is Huffman encoded by design.
Re: Extracting Bit Fields (Again)
by pg (Canon) on Oct 12, 2005 at 03:58 UTC

Here is a simple base converter; in fewer than 10 lines of code, it gives you what you wanted and more. It is not difficult to continue from here and extract the specific digits, with substr() if you wish.

```
use strict;
use warnings;

for (2 .. 9) {
    print "base $_: " . convert(162, $_), "\n";
}

sub convert {
    my ($value, $base) = @_;
    my $converted = "";
    while ($value) {
        $converted = ($value % $base) . $converted;
        $value = int($value / $base);
    }
    return $converted;
}
```

This gives:

```
base 2: 10100010
base 3: 20000
base 4: 2202
base 5: 1122
base 6: 430
base 7: 321
base 8: 242
base 9: 200
```
Re: Extracting Bit Fields (Again)
by ysth (Canon) on Oct 12, 2005 at 08:45 UTC
If you aren't using a very old perl, this should do it (update: slight tweaks):
```
my ($d1, $d2, $d3) = map oct "0b$_", sprintf("%08b", 162) =~ /(.)(...)(....)$/;
```
I'd appreciate it if someone could explain just what unpack "b1b3b4", 162 actually does; I've never found the pack documentation to be particularly informative.

Not quite what people above think it does.

```
perl> printf "%b\n", 162;;
10100010
perl> print unpack 'b1b3b4', 162;;
1 011 0100
perl> print unpack 'B1B3B4', 162;; ## In case of different endian machines.
0 001 0011
```

Update: The 'Bn' and 'bn' formats unpack bits from bytes, starting from the most or least significant bit respectively, but each 'Bn' unpacks bits from a whole number of bytes, always starting at a byte boundary.

So 'b1b3b4' unpacks 1, 3, & 4 bits from each of 3 different bytes. Even if there are more bits left in the current byte, a new byte is started when the next 'Bn' is processed.

Vis:

```
perl> print unpack '(B8)4', pack 'N', 0b11111111_10101010_01010101_00000000;;
11111111 10101010 01010101 00000000

perl> print unpack '(B4)4', pack 'N', 0b11111111_10101010_01010101_00000000;;
1111 1010 0101 0000
```

And if n>8, then more than one byte is used for the 'bn', but a new byte is still started for the next format.

```
perl> print unpack '(B16)2', pack 'N', 0b11111111_10101010_01010101_00000000;;
1111111110101010 0101010100000000

perl> print unpack '(B12)2', pack 'N', 0b11111111_10101010_01010101_00000000;;
111111111010 010101010000
```

But the significant thing here is that I have packed the input to a binary value.

With unpack 'b1b3b4', 162, quite where the input is drawn from is a mystery. The 162 obviously has a binary representation within the scalar, but this would be in the SvIV, and if it were that being decoded, you would get this result:

```
perl> print unpack '(b8)3', pack 'V', 162;;
01000101 00000000 00000000

perl> print unpack 'b1 b3 b4', pack 'V', 162;;
0 000 0000
```

I.e., the first 1, 3 & 4 bits of the first 3 bytes of the 4-byte binary representation of 162. But you don't.

You get this:

```
perl> print unpack '(B8)3', 162;;
00110001 00110110 00110010

perl> print unpack 'B1B3B4', 162;;
0 001 0011
```

which shows where the 0, 001 & 0011 come from: the first 1, 3 & 4 bits of three consecutive bytes. But that still leaves the question of where those bytes come from.

And the answer appears to be the ASCII values for 1, 6 & 2. My guess is that unpack treats its second argument as a string, so it is the string representation of the literal 162 that is being unpacked. Hence this gives the same result:

```
perl> print unpack 'B1B3B4', '162';;
0 001 0011
```

Thanks; I had suspected it was using the string "162", but it didn't occur to me it might be applying each B/b to a new byte.
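Pulling those two observations together, here is a rough sketch that is not from any of the posts above. The first part shows each B item consuming a fresh byte. The second packs 162 into one real byte, unpacks it with a single 'B8', and splits the resulting bit string by hand. The three-byte input string is made up purely so that every template item has data to read.

```perl
use strict;
use warnings;

# Three copies of the byte 0xA2 (10100010): each B item starts on a new byte,
# so every field is taken from the top of its own byte.
my @per_byte = unpack 'B1 B3 B4', "\xA2\xA2\xA2";
print "@per_byte\n";      # 1 101 1010

# One byte, one template item, then split the 8-character bit string.
my $bits = unpack 'B8', pack 'C', 162;     # "10100010"
my ( $d1, $d2, $d3 ) = map { oct "0b$_" } $bits =~ /^(.)(...)(....)$/;
print "$d1 $d2 $d3\n";    # 1 2 2
```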

Node Type: perlquestion [id://499354]
Approved by GrandFather