
I'm working with a data format in which the file consists of contiguous binary records. The first two bytes of each record are a packed 16-bit integer (unpack's 'n') holding the record length, inclusive of those two bytes. This means that for 142 bytes of data the record length is 144 (it counts the leading two bytes as well).
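To make the layout concrete, here is a minimal sketch that builds one such record. The payload is a made-up stand-in for real data; only the 'n' prefix and the inclusive length come from the description above.

```perl
use strict;
use warnings;

# Hypothetical payload standing in for 142 bytes of real record data.
my $data = 'x' x 142;

# The length prefix counts itself: payload length plus the two prefix bytes.
my $record = pack 'n a*', length($data) + 2, $data;

# Reading the prefix back gives the size of the whole record.
my $recordLen = unpack 'n', $record;
printf "prefix says %d, record is %d bytes\n", $recordLen, length $record;
```

The first two bytes of that record are 0x00 0x90, which is what the start of the sample file looks like.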

I figured out that I could use the single unpack format '(n/a)*' if the record length didn't include those two bytes. I tried changing that format to '(n/a XX)*' so that unpack would step back two bytes and stay aligned, but that doesn't parse as valid Perl. I ended up writing a while() loop to get the job done, but I'm left wondering whether there is a way to word the unpack format so that it works correctly. Any ideas?
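For comparison, here is a small sketch of the case where '(n/a)*' does work, that is, when the prefix counts only the payload. The two payload strings are invented for illustration and are not taken from my file.

```perl
use strict;
use warnings;

# Build two hypothetical records whose 16-bit prefix counts only the payload.
my $buf = join '', map { pack 'n/a*', $_ } 'abc', 'hello';

# n/a reads a 16-bit length and then that many bytes;
# the (...)* group repeats until the data runs out.
my @records = unpack '(n/a)*', $buf;

print join('|', @records), "\n";
```

With my inclusive lengths the same template reads two bytes too many per record, which is why I was trying to back up with X.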

    # Sample working while() loop
    $sourceMetacodeLen = length $sourceMetacode;
    $sourceMetacodePos = 0;
    while ($sourceMetacodePos < $sourceMetacodeLen)
      {
        $recordLen = unpack( 'n',
                             substr $sourceMetacode,
                             $sourceMetacodePos,
                             2);
        $record = substr( $sourceMetacode,
                          $sourceMetacodePos + 2,
                          $recordLen - 2);

        # not relevant to the example
        # $parse .= ${translate_record(\$record, $sourceMetacodePos, \@fonts)};

        $sourceMetacodePos += $recordLen;
      }

And here is an example of what the data looks like (after adding some newlines):

$ dd if=3012034-1.met bs=1 count=144|vis
\^@\M^P\^@\^@+$DJDE$   FONTS=(UN104B,HE18BP,HE06NP,HE08OP,HE08NP,HE09BP,
HE10BP,HE14BP,HE10VP,HE12NP,BLANKP,FORMSX,C395L ,HE12BP,HE36SP,HE08BP,HE
10NP),; \^A

$ od -x 3012034-1.met | head
0000000     9000    0000    242b    4a44    4544    2024    2020    4f46
0000020     544e    3d53    5528    314e    3430    2c42    4548    3831
0000040     5042    482c    3045    4e36    2c50    4548    3830    504f
0000060     482c    3045    4e38    2c50    4548    3930    5042    482c
0000100     3145    4230    2c50    4548    3431    5042    482c    3145
0000120     5630    2c50    4548    3231    504e    422c    414c    4b4e
0000140     2c50    4f46    4d52    5853    432c    3933    4c35    2c20
0000160     4548    3231    5042    482c    3345    5336    2c50    4548
0000200     3830    5042    482c    3145    4e30    2950    3b2c    0120
$
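A note on reading the two dumps together: vis shows the first two bytes as \^@ and \M^P (0x00 and 0x90), while od -x prints the first word as 9000, presumably because od groups bytes into 16-bit words in the host's byte order. A short sketch of the difference, using just those two bytes:

```perl
use strict;
use warnings;

# First two bytes of the sample file as shown by vis: 0x00 then 0x90.
my $prefix = "\x00\x90";

my $bigEndian    = unpack 'n', $prefix;   # big-endian, as used in the loop
my $littleEndian = unpack 'v', $prefix;   # little-endian, matching the od -x word

printf "n => %d, v => 0x%04x\n", $bigEndian, $littleEndian;
```

Read with 'n', the prefix is 144, which agrees with the dd count used to cut out the first record.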

__SIG__
printf "You are here %08x\n", unpack "L!", unpack "P4", pack "L!", B::svref_2object(sub{})->OUTSIDE

In reply to Unpacking fixed length records by diotalevi
