Subop has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I need your help!

I'm pretty new to programming and have been trying to understand (un)pack. However, the problem I have is trying to determine the templates.

Here is a small snippet of code from ChaosReader (http://chaosreader.sourceforge.net/)

### Unpack IP data
($ip_verNihl,$ip_tos,$ip_length,$ip_ident,$ip_flagNfrag,
 $ip_ttl,$ip_protocol,$ip_checksum,@ip_src[0..3],
 @ip_dest[0..3],$ip_data) = unpack('CCnnnCCa2CCCCCCCCa*', $ether_data);

How am I supposed to know that "CCnnnCCa2CCCCCCCCa*" is the correct template when decoding IP packets? It's network data so I would think I should be using "n" or "N" all the time, but it's only used three times in the above code.

Any help is appreciated.

Replies are listed 'Best First'.
Re: Need help with (un)pack templates
by gmargo (Hermit) on Dec 20, 2009 at 13:57 UTC

    CCnnnCCa2CCCCCCCCa*

    IP v4 Header format
    C 4-bit version, 4-bit header length
    C 8-bit type of service
    n 16-bit total length (in bytes)
    n 16-bit identification
    n 3-bit flags, 13-bit fragment offset
    C 8-bit time to live
    C 8-bit protocol
    a2 16-bit header checksum
    CCCC 32-bit source IP address
    CCCC 32-bit destination IP address
    a* data if any (assuming no IP header options)

    Wonder why they used a2 instead of n for the checksum?
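
As a rough illustration of the layout above, the following sketch packs a made-up 20-byte header plus four bytes of payload and then unpacks it with the ChaosReader template. The sample values (addresses, identification, checksum) are invented for the example and the checksum is not validated. It also shows one practical difference between a2 and n: a2 returns the checksum as a raw two-byte string, which needs a further unpack('n', ...) to become a number.

```perl
use strict;
use warnings;

# Hypothetical sample packet: every value below is made up for illustration.
my $ether_data = pack('C C n n n C C n C4 C4 a*',
    0x45,             # version 4, header length 5 (32-bit words)
    0,                # type of service
    24,               # total length: 20-byte header + 4 bytes of data
    0x1c46,           # identification
    0x4000,           # flags and fragment offset
    64,               # time to live
    6,                # protocol (6 is TCP)
    0xb1e6,           # checksum (arbitrary, not verified here)
    192, 168, 0, 1,   # source address
    10, 0, 0, 2,      # destination address
    'data');          # payload

my ($ip_verNihl, $ip_tos, $ip_length, $ip_ident, $ip_flagNfrag,
    $ip_ttl, $ip_protocol, $ip_checksum, $ip_data);
my (@ip_src, @ip_dest);

# Same template and variable names as the ChaosReader snippet.
($ip_verNihl, $ip_tos, $ip_length, $ip_ident, $ip_flagNfrag,
 $ip_ttl, $ip_protocol, $ip_checksum, @ip_src[0..3],
 @ip_dest[0..3], $ip_data) = unpack('CCnnnCCa2CCCCCCCCa*', $ether_data);

# a2 yields a raw two-byte string; convert it if a number is wanted.
my $checksum_number = unpack('n', $ip_checksum);

printf "length=%d ttl=%d protocol=%d\n", $ip_length, $ip_ttl, $ip_protocol;
printf "src=%s dest=%s\n", join('.', @ip_src), join('.', @ip_dest);
printf "checksum=0x%04x data=%s\n", $checksum_number, $ip_data;
```

Had the template used n in place of a2, $ip_checksum would already be a number and the extra unpack would be unnecessary.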

      CCnnnCCa2CCCCCCCCa*

      Another way to look at this is to realize that whitespace is ignored in a pack/unpack template and that template specifiers can be quantified. The original unpack template might have been written as
          'C C n n n C C a2 C4 C4 a*'
      to slightly clarify just what is going where and that in, e.g.,
          ..., @ip_src[0..3], ... = unpack( ... C4 ... );
      the array slice receives four unsigned bytes.
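
A quick way to convince yourself that the two spellings are equivalent is to run both over the same bytes and compare the results. This is only a sketch: the input is 24 arbitrary bytes (values 1 to 24), not a real packet.

```perl
use strict;
use warnings;

# 24 arbitrary bytes standing in for a packet (not a real IP header).
my $bytes = pack('C*', 1 .. 24);

my @compact  = unpack('CCnnnCCa2CCCCCCCCa*', $bytes);
my @readable = unpack('C C n n n C C a2 C4 C4 a*', $bytes);

# Compare field by field; the a2 and a* fields are strings, so use 'ne'.
my $same = @compact == @readable
    && !grep { $compact[$_] ne $readable[$_] } 0 .. $#compact;

print scalar(@compact), " fields, ", ($same ? "identical" : "different"), "\n";
```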

      Hey gmargo, that's exactly what I was looking for. Is there a site you got that from? I'd also like to be able to figure out how to decode other protocols as well.

        I typed it in from the front inside cover of my Stevens book (TCP/IP Illustrated Volume 1).

        Wikipedia has it too: http://en.wikipedia.org/wiki/IPv4, but the real gospel is RFC791: http://www.ietf.org/rfc/rfc791.txt.

        Update: Other protocol standards you might like:
        RFC 768 User Datagram Protocol
        RFC 792 Internet Control Message Protocol
        RFC 793 Transmission Control Protocol

Re: Need help with (un)pack templates
by BrowserUk (Patriarch) on Dec 20, 2009 at 13:43 UTC

    C is an unsigned 8-bit byte. a2 is a two-byte string; a* is a variable-length string.

    Bytes and strings do not have "endianness". Only multi-byte numbers do.
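
To make that concrete, here is a small sketch that decodes the same two bytes in several ways. The byte values are arbitrary. The v format (16-bit little-endian) is included only for contrast with n (16-bit big-endian, i.e. network order).

```perl
use strict;
use warnings;

my $two_bytes = "\x01\x02";   # arbitrary sample bytes

my @as_bytes  = unpack('C2', $two_bytes);   # two separate bytes: order is just position
my $as_string = unpack('a2', $two_bytes);   # the same two bytes, untouched
my $big       = unpack('n',  $two_bytes);   # big-endian:    0x0102
my $little    = unpack('v',  $two_bytes);   # little-endian: 0x0201

print "bytes:  @as_bytes\n";
printf "n: %d (0x%04x)\n", $big, $big;
printf "v: %d (0x%04x)\n", $little, $little;
```

Byte order only changes the answer for n and v, where two bytes are combined into one number.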


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Hey BrowserUk!

      I've read the docs on (un)pack, so I understand what the characters in the template represent. I just don't understand when I'm supposed to use those characters. For example, the above code decodes an IP packet, which is network data. How am I supposed to know I shouldn't be using "n" or "N" to decode the network data?

        How am I supposed to know I shouldn't be using "n" or "N" to decode the network data?

        That's a tough question to answer in full. Essentially, you have to know what format the data you are trying to decode is in.

        For example,

        • Fields 1 & 2 are both 4 bits, hence the template you have extracts them together as a single 8-bit number, which is presumably broken down further later in the code.
        • Field 3 is an 8-bit number, and is extracted as such.
        • Field 4 is a 16-bit number, hence they use 'n' to extract it.
        • Field 5: ditto.
        • Fields 6 & 7 are 3 & 13 bits respectively. But as #7 crosses a byte boundary, they extract the two as a single 16-bit number (the third 'n'), and then (presumably) break that down further later in the code.
        • etc. ...

        Without knowing what the format of the data you have is, there is no way to know what templates are applicable.
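
The "break that down further" step for the sub-byte fields might look something like the sketch below. This is not ChaosReader's actual code; the variable names and sample values are made up, and only the first 8 bytes of a header are built.

```perl
use strict;
use warnings;

# Hypothetical first 8 bytes of an IPv4 header (sample values only).
my $header = pack('C C n n n', 0x45, 0, 24, 0x1c46, 0x4000);

# Take byte 0, skip 5 bytes (tos, length, ident), then the 16-bit flags/offset.
my ($ver_ihl, $flag_frag) = unpack('C x5 n', $header);

my $version     = $ver_ihl >> 4;          # upper 4 bits
my $header_len  = $ver_ihl & 0x0f;        # lower 4 bits, counted in 32-bit words
my $flags       = $flag_frag >> 13;       # upper 3 bits
my $frag_offset = $flag_frag & 0x1fff;    # lower 13 bits

printf "version=%d header=%d bytes flags=%03b offset=%d\n",
    $version, $header_len * 4, $flags, $frag_offset;
```

unpack has no template character for a 4-bit, 3-bit or 13-bit integer, so the usual approach is to grab the enclosing 8- or 16-bit number and separate the pieces with shifts and masks.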


Re: Need help with (un)pack templates
by planetscape (Chancellor) on Dec 20, 2009 at 22:07 UTC