in reply to Optimizing binary file parser (pack/unpack)

Show some code; there may well be things that can be optimised.

If you have, or can install, Inline::C, it can speed up the processing of binary records considerably, especially if you move the optional-field logic into C.



Re^2: Optimizing binary file parser (pack/unpack)
by pwagyi (Monk) on Oct 04, 2017 at 01:54 UTC
    Here is pseudocode.

    Record fields can be either fixed or variable (array). Fixed data types: character, integral types (unsigned short, unsigned long), float. Variable data types: string (Pascal style, C/a), array of unsigned short, etc. Optional fields at the end of a record can be omitted, so a record with optional fields (byte, string (C/a), float) needs to be handled somehow.

    read_file_header
    determine endian from file_header data
    set up unpack data types (for big/little endian)

    if ($endian eq 'little') {
        $Float    = "f<";
        $uSHORT   = "v";
        $REC_HEAD = ...
    }
    else {
        $uSHORT   = "n";
        $REC_HEAD = ...
        # ...etc
    }

    while (1) {
        read (REC_HEAD) size data from file
        ($rec_len, $rec_type, ....) = unpack($REC_HEAD)
        my $rec_body = read($rec_len)

        # a big switch on rec_type
        if ($rec_type == FOO) {
            # unpack record_body somehow for THIS rec type
            # FOO layout:
            #   these 4 fields must be present in the record body:
            #     uSHORT, uLONG, uLONG, string (C/a)
            #   optional from this field onwards:
            #     Byte, Float, String
            my @data = unpack("$uSHORT $uLONG $uLONG C/a", $rec_body);

            # uSHORT (2) + 2 * uLONG (4) + C/a length byte (1) + string
            my $consumed_length = 10 + length($data[-1]) + 1;

            if ($consumed_length < $rec_len) {    # optional fields present
                push @data, unpack("x${consumed_length} C", $rec_body);
                $consumed_length += 1;
            }
            if ($consumed_length < $rec_len) {
                push @data, unpack("x${consumed_length} $Float", $rec_body);
                $consumed_length += 4;    # float is 4 bytes
            }
            # next optional ..etc
        }
        elsif ($rec_type == BAR) {
        }
    }
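
For reference, the FOO branch above could be written as runnable Perl roughly like this. It is only a sketch under assumptions: `parse_foo` and its `$little_endian` flag are hypothetical names, the optional fields are taken to be Byte, Float, String in that order, and the record body is assumed to contain at least the four required fields. The optional fields are driven from a small table rather than one `if` per field, so each record type would only need its own list.

```perl
use strict;
use warnings;

# Hypothetical helper: parse the body of one FOO record.
# $little_endian is assumed to have been determined from the file header.
sub parse_foo {
    my ($rec_body, $little_endian) = @_;

    # Endian-specific templates: v/V are little-endian, n/N big-endian.
    my $uSHORT = $little_endian ? 'v'  : 'n';
    my $uLONG  = $little_endian ? 'V'  : 'N';
    my $Float  = $little_endian ? 'f<' : 'f>';    # needs Perl 5.10+

    # Required fields: uSHORT, uLONG, uLONG, string (C/a).
    my @data   = unpack("$uSHORT $uLONG $uLONG C/a", $rec_body);
    my $offset = 2 + 4 + 4 + 1 + length($data[-1]);

    # Optional trailing fields in file order: [template, fixed size].
    # A size of undef marks a C/a string (1 length byte + the string).
    my @optional = ( ['C', 1], [$Float, 4], ['C/a', undef] );

    for my $field (@optional) {
        last if $offset >= length($rec_body);    # nothing left to read
        my ($tmpl, $size) = @$field;
        my @v = unpack("x$offset $tmpl", $rec_body);
        last unless @v;                          # truncated field
        push @data, $v[0];
        $offset += defined $size ? $size : 1 + length($v[0]);
    }
    return @data;
}

# Usage: build a little-endian test record with pack and parse it back.
my $rec = pack("v V V C/a C f< C/a", 7, 100000, 42, "abc", 9, 1.5, "xy");
print join(",", parse_foo($rec, 1)), "\n";    # prints 7,100000,42,abc,9,1.5,xy
```

Whether this is any faster than the hand-written `if` chain would need benchmarking on real data; the aim here is only to keep the offset bookkeeping in one place.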