in reply to Better unpack solution available

It's not really your unpacks that are messy.

But your unpacks could be improved by treating each record as having the following format:

header       body      footer
------------ --------- ------
id body_size size path crc32

How I'd write it:

#!/usr/bin/perl
use strict;
use warnings;

# Section name, then one record per line.
my $mem = pack('H*', join('', qw(
    54424c00
    4100001300000BAD2F62696E2F7465737432302E64666C376F6B0F
    42000013000000042F62696E2F7465737430312E64666C376D6C0F
    43000013000000042F62696E2F7465737430322E64666C376D6D0F
    44000013000000042F62696E2F7465737430332E64666C376D6E0F
    45000013000000042F62696E2F7465737430342E64666C376D6F0F
    46000013000000042F62696E2F7465737430352E64666C376D700F
    47000013000000042F62696E2F7465737430362E64666C376D710F
    48000013000000042F62696E2F7465737430372E64666C376D720F
    49000013000000042F62696E2F7465737430382E64666C376D730F
    4A000013000000042F62696E2F7465737430392E64666C376D740F
    4B000013000000042F62696E2F7465737431302E64666C376E6B0F
    4C000013000000042F62696E2F7465737431312E64666C376E6C0F
    4D000013000000042F62696E2F7465737431322E64666C376E6D0F
    4E000013000000042F62696E2F7465737431332E64666C376E6E0F
    4F000013000000042F62696E2F7465737431342E64666C376E6F0F
    50000013000000042F62696E2F7465737431352E64666C376E700F
    51000013000000042F62696E2F7465737431362E64666C376E710F
    52000013000000042F62696E2F7465737431372E64666C376E720F
    53000013000000042F62696E2F7465737431382E64666C376E730F
    54000013000000042F62696E2F7465737431392E64666C376E740F
    55000011000000042F6574632F74657374322E73683B0C0849
    5600000D000000082F477053772E63686B1175D3BB
)));

(my $section, $mem) = unpack('Z* a*', $mem);

print("$section\n");
print( ( "-" x length($section) ), "\n");

while (length($mem)) {
    (my $id, my $body, my $crc, $mem) = unpack('n n/a N a*', $mem);
    my ($size, $path) = unpack('N a*', $body);

    # Check CRC here.

    print("Load ID : $id\n");
    print("Load Size : $size\n");
    print("Load Path : $path\n");
    print("\n");
}
TBL
---
Load ID : 16640
Load Size : 2989
Load Path : /bin/test20.dfl

Load ID : 16896
Load Size : 4
Load Path : /bin/test01.dfl

Load ID : 17152
Load Size : 4
Load Path : /bin/test02.dfl

...
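To make the header/body/footer layout concrete, here is a minimal round-trip sketch that builds the first record of the sample data and reads it back with the same templates. The helper name make_record is hypothetical and only for illustration; it is not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: build one record in the header/body/footer layout.
sub make_record {
    my ($id, $size, $path, $crc) = @_;
    my $body = pack('N a*', $size, $path);       # body   = size + path
    return pack('n n/a* N', $id, $body, $crc);   # header = id + body length, footer = crc32
}

# Values taken from the first record of the sample data.
my $record = make_record(0x4100, 2989, '/bin/test20.dfl', 0x376F6B0F);
my $hex    = uc(unpack('H*', $record));
print("$hex\n");

# Read it back with the same template used in the loop above.
my ($id, $body, $crc, $rest) = unpack('n n/a N a*', $record);
my ($size, $path) = unpack('N a*', $body);
print("$id $size $path\n");
```

The n/a pair is what ties the body length to the body: on pack it writes the length for you, and on unpack it reads exactly that many bytes.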

Update: Added enumeration at the top.

Re^2: Better unpack solution available
by Dirk80 (Pilgrim) on Apr 09, 2010 at 21:25 UTC

    Thank you very much for your great answer. I think trying it myself first and then looking at a better solution is a good way to learn.

    Using 'a*' to read the rest of the memory, and shortening the buffer that way instead of using substr, is a very good idea. It is also interesting that you simply interpreted the offset field as the length of a body, so that the length sits directly in front of the body. One more small thing that I already knew but did not use: the 'x' operator to underline the string.
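For comparison, a small sketch of the two ways of walking a buffer mentioned here, using a made-up two-field buffer (a 16-bit length followed by that many bytes) rather than the real records:

```perl
use strict;
use warnings;

# Hypothetical buffer: two length-prefixed fields.
my $buf = pack('n/a* n/a*', 'foo', 'barbaz');

# Variant 1: 'a*' hands back the unread remainder on every pass.
my @via_unpack;
my $mem = $buf;
while (length($mem)) {
    (my $field, $mem) = unpack('n/a a*', $mem);
    push @via_unpack, $field;
}

# Variant 2: track the read position by hand with substr.
my @via_substr;
my $pos = 0;
while ($pos < length($buf)) {
    my $len = unpack('n', substr($buf, $pos, 2));
    push @via_substr, substr($buf, $pos + 2, $len);
    $pos += 2 + $len;
}

print("@via_unpack\n");
print("-" x length("@via_unpack"), "\n");   # 'x' underlines the line above
print("@via_substr\n");
```

Both variants yield the same fields; the first one simply has no offset arithmetic to get wrong.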

    So I learnt a lot from your post. Thank you.

      Hello,

      It's me again. I have now written a new version of the script. The first unpack gets the length of the path. The second unpack reads one repetition of data in one swoop. I like this solution because it matches the original record most closely. I do keep in mind, though, that I can use the length/string technique from ikegami's post whenever the length is directly before the string.

      #!/usr/bin/perl
      use strict;
      use warnings;

      # Section name, then one record per line.
      my $mem = pack('H*', join('', qw(
          54424c00
          4100001300000BAD2F62696E2F7465737432302E64666C376F6B0F
          42000013000000042F62696E2F7465737430312E64666C376D6C0F
          43000013000000042F62696E2F7465737430322E64666C376D6D0F
          44000013000000042F62696E2F7465737430332E64666C376D6E0F
          45000013000000042F62696E2F7465737430342E64666C376D6F0F
          46000013000000042F62696E2F7465737430352E64666C376D700F
          47000013000000042F62696E2F7465737430362E64666C376D710F
          48000013000000042F62696E2F7465737430372E64666C376D720F
          49000013000000042F62696E2F7465737430382E64666C376D730F
          4A000013000000042F62696E2F7465737430392E64666C376D740F
          4B000013000000042F62696E2F7465737431302E64666C376E6B0F
          4C000013000000042F62696E2F7465737431312E64666C376E6C0F
          4D000013000000042F62696E2F7465737431322E64666C376E6D0F
          4E000013000000042F62696E2F7465737431332E64666C376E6E0F
          4F000013000000042F62696E2F7465737431342E64666C376E6F0F
          50000013000000042F62696E2F7465737431352E64666C376E700F
          51000013000000042F62696E2F7465737431362E64666C376E710F
          52000013000000042F62696E2F7465737431372E64666C376E720F
          53000013000000042F62696E2F7465737431382E64666C376E730F
          54000013000000042F62696E2F7465737431392E64666C376E740F
          55000011000000042F6574632F74657374322E73683B0C0849
          5600000D000000082F477053772E63686B1175D3BB
      )));

      # NOTE:
      # Section name is a null terminated string.
      # Z* is setting the memory pointer after the
      # terminating null byte, but in the corresponding
      # variable the string is stored without the null
      (my $section, $mem) = unpack('Z* a*', $mem);

      print("$section\n");
      print("-" x length($section), "\n\n");

      # unpack and print
      # load id, offset, size,
      # path (length of path is "offset - 4")
      # and crc32 of each load entry
      while (length($mem)) {

          # NOTE:
          # The path is a variable string. And the length
          # of the string is not directly before the string.
          # If the length of the string would be directly
          # before the string then I could use the
          # length/string technique. But in this case I have
          # to use two unpacks. The first is to get the length
          # of the string (offset_to_next_load - 4). The second
          # unpack then has all information to read one
          # repetition of data in one swoop.

          # get length of path
          my $length_of_load_path = unpack('x2 n', $mem) - 4;

          # get data
          (my $load_id,
           my $offset_to_next_load,
           my $load_size,
           my $load_path,
           my $crc32_load_path,
           $mem) = unpack("n n N A$length_of_load_path N a*", $mem);

          print("Load ID : $load_id\n");
          print("Offset to next load : $offset_to_next_load\n");
          print("Load Size : $load_size\n");
          print("Load Path : $load_path\n");
          print("CRC32 of Load Path : $crc32_load_path\n\n");
      }
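One detail worth knowing about the "A$length_of_load_path" part of that template: when unpacking, 'A' strips trailing spaces and NUL bytes, whereas 'a' returns the bytes unchanged. None of the sample paths end in whitespace, so both give the same result here. The sketch below uses a made-up field to show the difference:

```perl
use strict;
use warnings;

# Hypothetical 4-byte field whose content ends in a space and a NUL byte.
my $raw = "ab \0";

my ($stripped) = unpack('A4', $raw);   # 'A' drops trailing spaces and NULs
my ($verbatim) = unpack('a4', $raw);   # 'a' keeps all four bytes

print(length($stripped), "\n");
print(length($verbatim), "\n");
```

If a path could ever legitimately end in a space, 'a' would be the safer choice for that field.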