Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to learn how to work with binary files in Perl instead of ANSI C. Perl seems much better suited to the job, given the unpack function. I have chosen an IP packet for this exercise and am trying to break the IP header into its various pieces, so far with little luck.

I am just trying to read the first 32 bits: IP version (4 bits), header length (4 bits), type of service (8 bits), and total length (16 bits). The following is the code I am using to read these 32 bits; obviously I have not grasped something from the pack/unpack documentation.
    #!/usr/bin/perl -w
    use strict;

    my $fname = "ip.header";
    open(FILE,$fname) or die $!;
    binmode(FILE);

    #read the first 32 bits
    read(FILE, my $foo, 4);

    my $ver = unpack "b4", $foo;
    my $hlen = unpack "b4", $foo;
    my $tos = unpack "b8", $foo;
    my $len = unpack "n", $foo;

    print "Ver = $ver, hlen = $hlen, tos = $tos, len = $len\n";
    close FILE;
The results are as follows:
Ver = 1111, hlen = 1111, tos = 11111100, len = 16194

Any enlightenment would be wonderful. As this is an attempt to learn pack/unpack better, please do not point to modules that will parse IP headers for me; I think that would be rather self-defeating in the long run.

Replies are listed 'Best First'.
Re: IP Header
by Aristotle (Chancellor) on Nov 02, 2002 at 21:42 UTC
    Your problem is that each unpack starts over at bit 0 of the binary data. But unpack is perfectly capable of returning any number of fields in one fell swoop, so this is what you really wanted:
        my ($ver, $hlen, $tos, $len) = unpack "b4b4b8n", $foo;
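To make the "starts over" behaviour concrete, here is a minimal sketch. The four sample bytes are an assumption made up for illustration (version 4, header length 5, TOS 0, total length 84), and whole-byte formats are used so the values are easy to read:

```perl
use strict;
use warnings;

# Hypothetical sample: the first four bytes of an IPv4 header.
my $foo = "\x45\x00\x00\x54";

# Separate calls: each one begins again at the first byte of $foo.
my $first  = unpack "C", $foo;
my $second = unpack "C", $foo;

# One call with one template: each field picks up where the previous one ended.
my ($byte0, $byte1, $len) = unpack "C C n", $foo;

printf "separate calls: %d, %d\n", $first, $second;
printf "single call:    %d, %d, %d\n", $byte0, $byte1, $len;
```

Both separate calls return the same first byte, while the single template walks forward through the data.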

    Makeshifts last the longest.

      Just to throw a wrench in the works ;-) The binary data is packed in network order; won't pulling it out using C2 fail to respect that? So I tried this:
          my ($data) = unpack "N", $foo;
          my ($ver, $hlen) = (vec($data,0,4), vec($data,1,4));
          my $tos = vec($data, 1, 8);
          my $len = vec($data, 1, 16);
      This returned completely unexpected results. Is it not possible to use vec on data pulled using "N"? (Yes, I know it can be more compact; it is split up for readability.)
        You misunderstand the purpose of that format. The network ordering is important for multibyte fields. The order of bytes inside such a field may differ from what the host platform expects. The order of fields however does not change, regardless of platform. Two byte-sized fields are always extracted using C2; a single 16-bit field is extracted using n.
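A small sketch of that distinction, using a made-up two-byte field (an assumption for illustration, holding the value 84 in network order). The v format is included only to show what a little-endian reading of the same bytes would give:

```perl
use strict;
use warnings;

# Hypothetical 16-bit total-length field as it appears on the wire.
my $field = "\x00\x54";

my $big    = unpack "n", $field;    # one 16-bit value, big-endian (network) order
my $little = unpack "v", $field;    # same bytes read as little-endian
my @bytes  = unpack "C2", $field;   # two independent bytes, in stream order

print "n: $big, v: $little, C2: @bytes\n";
```

Only the multibyte reading depends on byte order; the two single bytes come out in the order they sit in the data either way.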

        Makeshifts last the longest.

      My understanding of the way IP headers are formed is that $ver should equal 4 when all is said and done. So, while grabbing it all in one swoop fixes the repeated data, I am still not obtaining the expected results.

      If you have 4 bits that represent, say, an integer, how do you unpack them to a usable value?
        Oh. Silly me. You have a problem there: unpack consumes its input in byte chunks. Even if you specify b4, it will consume a full byte. (If you specify b10, it will eat two full bytes.) Note also that b8 will return a string of the form 00011111, not 31; for that purpose you need C (unsigned character). For in-byte bit fiddling, you need vec. It works on strings rather than numbers, takes its arguments as (string, offset, bits), and with 4-bit elements counts the low nibble of each byte as element 0, so pull the first byte out as a string with a:
            my ($ver_hlen, $tos, $len) = unpack "a C n", $foo;
            my ($hlen, $ver) = (vec($ver_hlen, 0, 4), vec($ver_hlen, 1, 4));
        or more compactly
            my ($hlen, $ver) = map { vec($ver_hlen, $_, 4) } 0 .. 1;
        See perldoc -f vec for details.
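As an alternative to vec, the two nibbles can also be split with a shift and a mask once the byte has been read as a number. This is only a sketch: the sample bytes are a made-up assumption (version 4, header length 5, TOS 0, total length 84), and the first unpack is there just to show that b4 really does swallow a whole byte:

```perl
use strict;
use warnings;

# Hypothetical sample: the first four bytes of an IPv4 header.
my $foo = "\x45\x00\x00\x54";

# b4 yields four bits but still consumes the whole byte,
# so the second b4 reads from the NEXT byte, not the other nibble.
my ($bits_a, $bits_b) = unpack "b4 b4", $foo;

# Read whole bytes as numbers, then split the first one by hand.
my ($ver_hlen, $tos, $len) = unpack "C C n", $foo;
my $ver  = $ver_hlen >> 4;      # high nibble: IP version
my $hlen = $ver_hlen & 0x0F;    # low nibble: header length in 32-bit words

print "b4 b4: $bits_a $bits_b\n";
print "Ver = $ver, hlen = $hlen, tos = $tos, len = $len\n";
```

With these sample bytes the version comes out as 4, which is the value the question expected.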

        Makeshifts last the longest.