5mi11er has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I was playing with some pack/unpack code dealing with long integers; these just happened to be generated from IP addresses, but that is off topic. I had working code to figure out whether it was running on a big- or little-endian machine:
sub is_big_endian {
    my ($a,$b,$c,$d) = unpack("C4",pack("L",0x01020304));
    ($a==0x01) && ($b==0x02) && return 1; #big endian
    ($a==0x04) && ($b==0x03) && return 0; #little endian
}
But I got to thinking, "Why do I need to pack the long integer 0x01020304 in the first place?" I thought I'd be able to do something like this:
sub is_big_endian {
    $num=0x01020304;
    my ($a,$b,$c,$d) = unpack("C4",$num);
    ($a==0x01) && ($b==0x02) && return 1; #big endian
    ($a==0x04) && ($b==0x03) && return 0; #little endian
}
That didn't work. Running it in the debugger, I tried LOTS of variations, attempting to make sense of what was actually going on:
DB<1> c 242
main::is_big_endian(test-dns.pl:242):
242: my ($a,$b,$c,$d) = unpack("C4",0x01020304);
DB<2> n
main::is_big_endian(test-dns.pl:243):
243: ($a==0x01) && ($b==0x02) && return 1; #big endian
DB<2> p $a
49
DB<3> p $b
54
DB<4> p $c
57
DB<5> p $d
48
DB<6> p unpack("C4",0x01020304)
49545748
DB<7> p unpack("L",0x01020304)
825637168
DB<12> $a=0x01020304
DB<13> p $a
16909060
DB<14> p unpack("CCCC",$a)
49545748
DB<15> p unpack("C4",$a)
49545748
DB<17> p unpack("C4",pack("L",($a)))
1234
[..stuff removed..]
DB<51> p $a
16909060
DB<52> p unpack("b32",pack("L",$a))
10000000010000001100000000100000
DB<55> p unpack("B32",pack("L",$a))
00000001000000100000001100000100
DB<3> p unpack ("C*",x01020304)
1204849485048514852
DB<4> p unpack ("C*",0x01020304)
4954574857485448
DB<5> p unpack ("C*",0x00000001020304)
4954574857485448
DB<10> print join(" ", map { sprintf "%#02x", $_ } unpack("C*",pack("L",0x12345678))), "\n";
0x12 0x34 0x56 0x78
DB<11> print join(" ", map { sprintf "%#02x", $_ } unpack("C*",0x12345678)), "\n";
0x33 0x30 0x35 0x34 0x31 0x39 0x38 0x39 0x36
Then I read that, internally, all numbers are represented as double-precision floats, so I tried those as well:
DB<15> print join(" ", map { sprintf "%#02x", $_ } unpack("C*",pack("D",$a))), "\n";
0x41 0xb2 0x34 0x56 0x78 00 00 00
DB<16> print join(" ", map { sprintf "%#02x", $_ } unpack("C*",pack("D*",$a))), "\n";
0x41 0xb2 0x34 0x56 0x78 00 00 00
DB<17> print join(" ", map { sprintf "%#02x", $_ } unpack("C*",pack("d*",$a))), "\n";
0x41 0xb2 0x34 0x56 0x78 00 00 00
DB<18> print join(" ", map { sprintf "%#02x", $_ } unpack("C*",pack("d",$a))), "\n";
0x41 0xb2 0x34 0x56 0x78 00 00 00
And those didn't match what I'd gotten before. So my question boils down to: "What is *actually* going on internally to give me the answer I get when doing an unpack("C",$a) where $a=0x01020304?"
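For what it's worth, the bytes in those floating-point dumps look like the IEEE 754 double-precision encoding of 305419896 (0x12345678), which suggests $a held that value by then rather than 0x01020304. Here is a minimal sketch to check this, assuming perl 5.10 or later (for the ">" byte-order modifier) and a platform with IEEE 754 doubles. The helper name double_bytes_hex is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: hex dump of the big-endian double encoding of a number.
sub double_bytes_hex {
    my ($num) = @_;
    return unpack("H*", pack("d>", $num));
}

# Sign and exponent come first, followed by the mantissa bits.
print double_bytes_hex(0x12345678), "\n";
print double_bytes_hex(0x01020304), "\n";
```

Either way, the output is the bit pattern of a floating-point value, not the character codes of the decimal digits.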

Re: pack, unpack and internal representation of numbers
by blokhead (Monsignor) on Jul 01, 2004 at 20:34 UTC
    pack and unpack convert things to and from an internal string representation. In your example, unpack interprets the scalar 0x01020304 in a string context, which is "16909060".

    On the other hand, pack("L", 0x01020304) returns a string which is along the lines of "\x01\x02\x03\x04".
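A small sketch of the point about string context, for anyone who wants to try it. It only uses core functions. The variable names are illustrative and the comments assume an ASCII-based platform.

```perl
use strict;
use warnings;

my $num = 0x01020304;

# In string context the number becomes its decimal text.
my $as_string = "$num";

# unpack("C*") therefore sees the character codes of those digits,
# which is the same as calling ord on each character of the string.
my @from_unpack = unpack("C*", $num);
my @from_ord    = map { ord } split //, $as_string;

print "string form:  $as_string\n";
print "unpack C*:    @from_unpack\n";
print "ord per char: @from_ord\n";
```

Both lists should come out identical, which is why the byte order of the machine never enters into it.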

    blokhead

      Yes!

      Explaining it fully for those that don't understand yet:

      From the string "16909060", '1'=ASCII(49) '6'=ASCII(54) '9'=ASCII(57) '0'=ASCII(48) etc.
      Thank you, that's the understanding I was looking for!
Re: pack, unpack and internal representation of numbers
by BrowserUk (Patriarch) on Jul 01, 2004 at 20:48 UTC

    As an aside, you can find the endianness(?) of the machine from Config.

    use Config;

    # $Config{byteorder} is a string describing the byte order perl was built with
    if ( $Config{byteorder} eq '1234' ) {
        print "Little endian";
    }
    elsif ( $Config{byteorder} eq '4321' ) {
        print "Big endian";
    }
    else {
        print "You're screwed!";
    }
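One caveat, as far as I know: perls built with 64-bit integers usually report an eight-character byteorder such as '12345678' or '87654321', which an exact comparison against '1234' or '4321' would not match. A hedged sketch that accepts both lengths follows. The helper name byte_order_name is hypothetical and not part of Config.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: classify a byteorder string such as '1234' or '87654321'.
sub byte_order_name {
    my ($order) = @_;
    return 'little' if $order =~ /^1234(?:5678)?$/;
    return 'big'    if $order =~ /^(?:8765)?4321$/;
    return 'unknown';    # mixed-endian or anything else unexpected
}

print "This perl reports byteorder $Config{byteorder}: ",
      byte_order_name($Config{byteorder}), " endian\n";
```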

    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "Think for yourself!" - Abigail
    "Memory, processor, disk in that order on the hardware side. Algorithm, algoritm, algorithm on the code side." - tachyon
Re: pack, unpack and internal representation of numbers
by hardburn (Abbot) on Jul 01, 2004 at 20:29 UTC

    From the pack documentation:

    The integer formats s, S, i, I, l, L, j, and J are inherently non-portable between processors and operating systems because they obey the native byteorder and endianness.

    Which should be taken to imply that the other formats do not follow the endian-ness of the platform. So your original L solution was correct in this case, because dealing with the endian-ness is the whole point.
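To illustrate the difference between native and fixed-order formats, here is a minimal sketch comparing L with N (big-endian, "network" order) and V (little-endian, "VAX" order), both of which are documented pack formats. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $num = 0x01020304;

# N and V have a fixed byte order regardless of the machine.
my $network = unpack("H*", pack("N", $num));   # always 01020304
my $vax     = unpack("H*", pack("V", $num));   # always 04030201

# L uses the native order, so on common hardware it matches one of the two.
my $native  = unpack("H*", pack("L", $num));

my $endian = $native eq $network ? 'big'
           : $native eq $vax     ? 'little'
           :                       'other';

print "N: $network  V: $vax  L: $native  => $endian endian\n";
```

Comparing the native result against a fixed-order one is just another way of writing the original is_big_endian test.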

    ----
    send money to your kernel via the boot loader.. This and more wisdom available from Markov Hardburn.

      Ok, but shouldn't a number, at run time, "obey the native byte order and endianness" as well?

      If this integer were stored in integer form internally, and I unpack that into bytes, I should get some pattern of 0x01, 0x02, 0x03, and 0x04 depending on the endianness.

      Perl's internal representation is not integer, but it is also not "double float"; at least not as defined by pack/unpack.

      I think maybe what I'm really asking is: "How do I go about packing 0x01020304 such that I get "49545748" as my answer?" Realizing of course, this answer is valid only on similar architecture machines...
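One possible way to get that answer deliberately is to pack the number as text with the "A*" format, which stringifies it first. This is only a sketch with illustrative variable names. Note that the result depends on the character set (ASCII here), not on the byte order of the machine.

```perl
use strict;
use warnings;

my $num = 0x01020304;

# "A*" packs the value as text, i.e. the decimal string for the number.
my $packed = pack("A*", $num);

# Take the first four character codes and join them the way
# the debugger's "p" command printed them.
my $joined = join "", unpack("C4", $packed);

print "$joined\n";
```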

Re: pack, unpack and internal representation of numbers
by gaal (Parson) on Jul 01, 2004 at 22:40 UTC
    What everybody said, plus, even if you weren't being bitten by a string representation of a number, nobody promised you that your number would be kept in an int (and not, say, a double). See perlnumber for the gory details.
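If you want to peek at how a particular scalar is currently stored, the core B module exposes the flags. The following is a rough sketch, assuming the SVf_IOK, SVf_NOK and SVf_POK constants can be imported from B on request. The helper name storage_kind is made up, and it only reports the first matching public flag.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK);

# Hypothetical helper: report which public "OK" flag a scalar carries.
sub storage_kind {
    my $flags = svref_2object(\$_[0])->FLAGS;
    return 'integer' if $flags & SVf_IOK;
    return 'double'  if $flags & SVf_NOK;
    return 'string'  if $flags & SVf_POK;
    return 'other';
}

print "0x01020304 is stored as: ", storage_kind(0x01020304), "\n";
print "1.5        is stored as: ", storage_kind(1.5), "\n";
print "'abc'      is stored as: ", storage_kind('abc'), "\n";
```

A scalar can carry more than one of these flags at once, and which ones are set can change as the value is used, so treat the output as a snapshot rather than a guarantee.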