in reply to Understanding endianness of a number
In many cases, endianness is transparent to the programmer, except when explicitly converting between numbers and bytes or when you need to look at the memory directly. In that way, it's kind of like a character encoding: in your program, you work mostly with strings and characters, usually not caring how they're stored internally, and only when converting to and from streams of bytes does it become important how those more abstract notions of characters are represented as bytes (e.g. UTF-8 vs. UTF-16 vs. many more). In the same way, in a Perl program, you can say my $x = 48879;, and you don't have to care how that number gets represented in memory, until you have to think about how to read or write it to a binary file or send/receive it over a data link as a series of bytes. In both cases, there are two levels of thinking: the "more abstract" notion of numbers/characters versus the machine-level bytes, and explicit conversion is needed between the two. The conceptual issues arise because this conversion is often implicit rather than explicit, so programmers rarely have to think about it.
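To make the character-encoding analogy concrete, here is a minimal sketch using the core Encode module. The variable names and the choice of the euro sign as the example character are my own. The same abstract character becomes different bytes depending on the encoding you ask for, and UTF-16 even has its own big endian and little endian variants.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One abstract character: U+20AC (EURO SIGN).
my $char = "\x{20AC}";

# Only when converting to bytes do we have to choose a representation.
my $utf8    = encode("UTF-8",    $char);
my $utf16be = encode("UTF-16BE", $char);
my $utf16le = encode("UTF-16LE", $char);

# Show each byte string as hex digits.
printf "UTF-8:    %s\n", unpack("H*", $utf8);      # e282ac
printf "UTF-16BE: %s\n", unpack("H*", $utf16be);   # 20ac
printf "UTF-16LE: %s\n", unpack("H*", $utf16le);   # ac20
```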
(For the purpose of this explanation, assume byte addressable memory everywhere, and let's ignore that modern machines of course work with words of multiple bytes. Your question comes from I2C anyway, which works on the byte level.)
So assuming you want to store my $x = 48879;, or my $x = 0xBEEF; as a 16-bit unsigned integer in two bytes, there are two ways to do that: with the most significant byte 0xBE at the lower memory address, or at the higher one (or in the case of a protocol, 0xBE being transmitted first, or second).
    48879 = 0xBEEF
              ^ ^
     0xBE = MSB LSB = 0xEF

    Memory Address:   0  |  1   |  2   |  3
                         |      |      |
    Little Endian:   ... | LSB  | MSB  | ...
                         | 0xEF | 0xBE |
                         |      |      |
    Big Endian:      ... | MSB  | LSB  | ...
                         | 0xBE | 0xEF |

    # little endian
    $ perl -MData::Dump -e 'dd pack "S<", 0xBEEF'
    "\xEF\xBE"

    # big endian
    $ perl -MData::Dump -e 'dd pack "S>", 0xBEEF'
    "\xBE\xEF"
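The reverse direction works the same way. As a small sketch (variable names are mine), here is what happens when the same two bytes are read back with unpack under each byte order. The "<" and ">" modifiers need Perl 5.10 or later.

```perl
use strict;
use warnings;

# The same two bytes, as they might arrive from a file or a data link.
my $bytes = "\xEF\xBE";

# Turning them back into a number means choosing a byte order explicitly.
my $as_little = unpack "S<", $bytes;   # first byte is treated as the LSB
my $as_big    = unpack "S>", $bytes;   # first byte is treated as the MSB

printf "read as little endian: 0x%04X\n", $as_little;   # 0xBEEF
printf "read as big endian:    0x%04X\n", $as_big;      # 0xEFBE
```

The bytes themselves carry no hint about which interpretation is right; that has to come from the file format or protocol documentation.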
What can sometimes be confusing is that some diagrams of memory addresses or transmission protocols place the least significant bit on the right side of the diagram (because bytes are typically written with their most significant bit first, as in 170 == 0b10101010), but at the same time put the lowest memory address on the left, and there are often other variations of this. In fact, if I recall correctly, sorting out this initial left-to-right/right-to-left confusion was probably one of the most important things to help make endianness "click" for me. Another thing to keep in mind is that when you write 0xBEEF in your source code, that's still a single 16-bit value, and not yet two bytes; you don't yet know how it'll be represented in memory.
To answer your two questions: Yes, you're correct. In 0x03FF, the MSB is 0x03 (the "big end") and the LSB is 0xFF (the "little end"), so the first is big endian order since you print the big end first, and the second is little endian order because you print the little end first. But just to be clear, on the other hand, $b1 and $b2 are two unconnected variables - so what you've really got there is two separate bytes, not a two-byte value stored in a certain order. (Update: I wouldn't have picked this nit if you had stored them in an array instead, since the array indices take the place of the memory addresses :-) )
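Here is a rough sketch of the array version I mean. The names @big_endian and @little_endian are just for illustration and don't come from your code. Index 0 stands in for the lowest memory address, so the order of the elements now really is a byte order.

```perl
use strict;
use warnings;

my $num = 0x03FF;

# Split the 16-bit value into its two bytes (shift first, then mask).
my $msb = ($num >> 8) & 0xFF;   # 0x03, the "big end"
my $lsb =  $num       & 0xFF;   # 0xFF, the "little end"

# Index 0 plays the role of the lowest memory address.
my @big_endian    = ($msb, $lsb);
my @little_endian = ($lsb, $msb);

# Putting the number back together requires knowing which order was used.
my $from_big    = ($big_endian[0]    << 8) | $big_endian[1];
my $from_little = ($little_endian[1] << 8) | $little_endian[0];

printf "big endian:    %s\n", join " ", map { sprintf "0x%02X", $_ } @big_endian;
printf "little endian: %s\n", join " ", map { sprintf "0x%02X", $_ } @little_endian;
printf "reassembled:   0x%04X and 0x%04X\n", $from_big, $from_little;
```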
For now I've ignored additional topics like bigger numbers stored as four or more bytes, where at least in theory there are more than two possible orderings, but I hope that if the principle and the 16-bit version make sense, understanding the documentation for wider values will be easier.
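As a quick taste of the wider case, here is a minimal sketch with a 32-bit value. The value 0xDEADBEEF and the variable names are arbitrary choices of mine. It only shows the two usual orderings that pack supports with "<" and ">", not any of the more exotic mixed orders.

```perl
use strict;
use warnings;

my $value = 0xDEADBEEF;

# "L" is an unsigned 32-bit integer; "<" and ">" select the byte order.
my $little = pack "L<", $value;
my $big    = pack "L>", $value;

# Show the resulting four bytes as hex digits, lowest address first.
printf "little endian: %s\n", unpack("H*", $little);   # efbeadde
printf "big endian:    %s\n", unpack("H*", $big);      # deadbeef
```

The principle is the same as with two bytes: in little endian order the least significant byte comes first, and in big endian order the most significant byte comes first.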
(By the way, I like to shift first and then mask, i.e. $b1 = ($num >> 8) & 0xFF;, because I've been burned on a small microprocessor where the C compiler implemented the bit shift with a rotate instruction instead. I forget which processor and compiler it was though... plus I don't think Perl would run on such a uC, so it's really just a preference I've developed as a result.)
(As the AM post points out, endianness could also refer to the order in which bits of a byte are transmitted, but in my experience, I've pretty much always seen the term endianness referring only to byte order; most protocol descriptions I've read will instead explicitly state "the least/most significant bit is transmitted first/last".)
Minor updates for clarity.