magawake has asked for the wisdom of the Perl Monks concerning the following question:

I have a binary file and I would like to get the first, second and third fields of each record. The first field is at offset 0 (of course), is 2 bytes long, and holds a binary integer. The second field is at offset 2 and is also a 2-byte binary integer. The third field is at offset 4 and is again a 2-byte binary integer. What is the easiest way to do this? I am getting confused about how to use unpack. Also, let's assume one of these fields is ASCII (offset 15, size 1): how could I see this string? Any thoughts? TIA

  • All field sizes are fixed and constant.
  • All fields are contiguous. There is no explicit 'padding' between fields regardless of the data types, size and alignment issues.
  • The binary fields are provided in big-endian format and are unsigned.
  • ASCII strings are left-aligned and null-padded.
    Re: unpack offset question
    by ikegami (Patriarch) on Mar 28, 2009 at 04:39 UTC

      ( Please avoid putting your entire post in code tags. Just put <p> at the start of every paragraph instead. )

      You didn't specify byte order and whether the numbers are signed or unsigned.

      @fields = unpack('n3', $_);   # unsigned big-endian
      @fields = unpack('v3', $_);   # unsigned little-endian
      @fields = unpack('S3', $_);   # unsigned native
      @fields = unpack('n!3', $_);  # signed big-endian
      @fields = unpack('v!3', $_);  # signed little-endian
      @fields = unpack('s3', $_);   # signed native
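      To make the difference between the byte orders concrete, here is a minimal sketch that runs the two unsigned templates over a made-up 6-byte record. The sample bytes and variable names are assumptions for illustration only, not data from the original file.

```perl
use strict;
use warnings;

# Hypothetical 6-byte record holding three 16-bit fields.
my $record = "\x00\x01\x01\x02\xFF\xFF";

# 'n' reads each pair of bytes most-significant byte first.
my @big_endian = unpack('n3', $record);

# 'v' reads the same bytes least-significant byte first.
my @little_endian = unpack('v3', $record);

print "n3: @big_endian\n";     # n3: 1 258 65535
print "v3: @little_endian\n";  # v3: 256 513 65535
```

      The same bytes give different numbers under each template, which is why the byte order has to be known before picking one.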

      You provided even less information about the strings.

      • Is the field fixed width or variable width?
      • If the field is fixed width, how do you know how much of the field is the string?
      • If the field is variable width, how do you know how many bytes the string is?
      • What character encoding is used? (Did you really mean ASCII?)

      Update: n! and v! were introduced in 5.10. Pre 5.10,

      # signed big-endian
      @fields = map { unpack 's', pack 'S', $_ } unpack('n3', $_);

      # signed little-endian
      @fields = map { unpack 's', pack 'S', $_ } unpack('v3', $_);
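      As a quick check of how the pre-5.10 workaround behaves, the sketch below applies it to a made-up record whose last field has the high bit set. The sample bytes are an assumption, and the arithmetic variant at the end is an alternative added here for comparison, not something from the reply above.

```perl
use strict;
use warnings;

# Hypothetical record: 1, 258, and 0xFFFF (which is -1 when read as signed).
my $record = "\x00\x01\x01\x02\xFF\xFF";

# Read as unsigned big-endian, then reinterpret each value as a signed
# 16-bit integer by round-tripping through native 'S' (unsigned) and 's' (signed).
my @signed = map { unpack 's', pack 'S', $_ } unpack('n3', $record);

# Alternative: plain arithmetic, no second pack/unpack.
my @signed_arith = map { $_ >= 0x8000 ? $_ - 0x10000 : $_ } unpack('n3', $record);

print "round trip: @signed\n";        # round trip: 1 258 -1
print "arithmetic: @signed_arith\n";  # arithmetic: 1 258 -1
```

      The round trip works on any platform because the pack and unpack both use the native byte order, so the two cancel out.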

      Update: Oops, had S for both signed and unsigned at the top. Thanks AnomalousMonk.

        Thank you for the kind response.
      • All field sizes are fixed and constant.
      • All fields are contiguous. There is no explicit 'padding' between fields regardless of the data types, size and alignment issues.
      • The binary fields are provided in big-endian format and are unsigned.
      • ASCII strings are left-aligned and null-padded.
          Use 'Z6' for a NUL-padded string 6 bytes long. The NULs, if any, are removed.
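          A small sketch of what 'Z6' does compared with 'a6', which keeps the padding. The 6-byte sample field is an assumption made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical 6-byte field: "abc" left-aligned, padded with three NULs.
my $field = "abc\0\0\0";

# 'Z6' consumes 6 bytes and returns everything before the first NUL.
my ($clean) = unpack('Z6', $field);

# 'a6' consumes 6 bytes and returns them unchanged, NULs included.
my ($raw) = unpack('a6', $field);

printf "Z6: '%s' (length %d)\n", $clean, length $clean;  # Z6: 'abc' (length 3)
printf "a6 length: %d\n", length $raw;                   # a6 length: 6
```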
    Re: unpack offset question
    by targetsmart (Curate) on Mar 28, 2009 at 10:07 UTC
      I am getting confused on how to use unpack
      Pack/Unpack Tutorial (aka How the System Stores Data)
      perlpacktut

      Vivek
      -- In accordance with the prarabdha of each, the One whose function it is to ordain makes each to act. What will not happen will never happen, whatever effort one may put forth. And what will happen will not fail to happen, however much one may seek to prevent it. This is certain. The part of wisdom therefore is to stay quiet.
    Re: unpack offset question
    by magawake (Novice) on Mar 29, 2009 at 16:58 UTC
      Thank you. This post helps a lot. But what if I have an ASCII string (left-aligned, null-padded) at offset 15 that is 1 byte long? How can I extract only that?
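
      One possible approach, sketched under assumptions: unpack's '@' code moves to an absolute offset in the string, so '@15 Z1' reads the single byte at offset 15. The 16-byte record layout below (three integers, nine filler bytes, one ASCII byte) is made up for illustration; the real record may differ.

```perl
use strict;
use warnings;

# Hypothetical 16-byte record: three 16-bit big-endian integers,
# 9 NUL filler bytes, then a 1-byte ASCII field at offset 15.
my $record = pack('n3 x9 a1', 1, 2, 3, 'Y');

# '@15' jumps to absolute offset 15; 'Z1' reads one NUL-padded byte.
my ($flag) = unpack('@15 Z1', $record);

# The same byte via substr, without unpack.
my $flag_substr = substr($record, 15, 1);

# All four fields in a single template.
my ($f1, $f2, $f3, $f4) = unpack('n3 @15 Z1', $record);

print "flag: $flag\n";                  # flag: Y
print "fields: $f1 $f2 $f3 $f4\n";      # fields: 1 2 3 Y
```

      If the byte at offset 15 is a NUL, 'Z1' returns an empty string, whereas substr returns the NUL byte itself.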