http://qs1969.pair.com?node_id=983244

joemaniaci has asked for the wisdom of the Perl Monks concerning the following question:

So I have been working on reverse engineering a binary file (viewed in hex) that some genius thought it would be a great idea to write in both little endian and big endian. Anyway, I came across what I believe to be a bug in Perl, but I wanted to make sure, and so far my coworkers agree with me.

So I have these files that contain three sorts of records. There is the initial header, which occurs only once, contains a total of 4096 bytes, and is made up of four-byte floats and four-byte int32s.

The second record is 36 bytes long, also made up of four-byte floats and four-byte int32s. It appears to be precursor data to the third record type. It also has a field that states how many instances of the third record follow. After those, another instance of this second record can follow, along with another set of third-type records.

The third record is 28 bytes long and contains (in order) five four-byte floats, one four-byte int32 and two two-byte shorts. The number of instances of this record is stored in record 2.
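To make the layout of the third record concrete, here is a minimal sketch that decodes all 28 bytes with a single unpack template. The byte orders in the template are an assumption inferred from which fields get reversed on a little-endian machine (floats and shorts big-endian, int32 little-endian), and the name unpack_record3 and the sample values are made up for illustration. The < and > modifiers need Perl 5.10 or later.

```perl
use strict;
use warnings;

# Assumed byte orders (not stated explicitly in the post):
# floats big-endian, int32 little-endian, int16 big-endian.
my $RECORD3_TEMPLATE = 'f>5 l< s>2';
my $RECORD3_SIZE     = 28;    # 5*4 + 4 + 2*2

# Hypothetical helper: decode one 28-byte record into a list of 8 values.
sub unpack_record3 {
    my ($bytes) = @_;
    die "expected $RECORD3_SIZE bytes, got " . length($bytes) . "\n"
        unless length($bytes) == $RECORD3_SIZE;
    return unpack($RECORD3_TEMPLATE, $bytes);
}

# Round trip on made-up sample values.
my $sample = pack($RECORD3_TEMPLATE, 1.0, 2.0, 3.0, 4.0, 5.0, 25, 1, 0);
my @fields = unpack_record3($sample);
print "@fields\n";
```

Reading the whole record in one call and checking its length makes a dropped byte show up as a size mismatch instead of as garbage values.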

Now I have all of my unpacking/packing, swapping big to little endian, templates etc etc taken care of. Reading a float, for example, is done in a subroutine...

sub grabFloat {
    my $FourBytes = 4;
    my $floatTemp = 'f';
    read(IN, my $record, $FourBytes);
    $record = reverse $record;
    my $value = unpack($floatTemp, $record);
    return $value;
}

My code for grabbing my short is ...

sub grabInt16 {
    my $TwoBytes = 2;
    my $Int16 = 's';
    read(IN, my $record, $TwoBytes);
    $record = reverse $record;
    my $value = unpack($Int16, $record);
    return $value;
}

Int32s are pretty much the same as floats, except they have the opposite endianness and don't need to be reversed.
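The three readers could be folded into one helper that also checks how many bytes read actually returned. This is only a sketch: read_field is a hypothetical name, the templates assume the byte orders implied above, and the binmode call is an assumption, since the post does not show how IN is opened. An in-memory filehandle stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $size bytes from $fh and unpack them
# with $template, dying if the read comes up short.
sub read_field {
    my ($fh, $size, $template) = @_;
    my $got = read($fh, my $buf, $size);
    die "read failed: $!\n" unless defined $got;
    die "short read: wanted $size bytes, got $got at offset " . tell($fh) . "\n"
        unless $got == $size;
    return unpack($template, $buf);
}

# Demo on an in-memory file holding one float, one int32 and one int16.
my $data = pack('f> l< s>', 1.5, 26, 2);
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!\n";
binmode($fh);    # assumption: the real handle should be in binary mode too

my $float = read_field($fh, 4, 'f>');
my $int32 = read_field($fh, 4, 'l<');
my $int16 = read_field($fh, 2, 's>');
close($fh);
print "$float $int32 $int16\n";
```

Using the explicit-endian templates removes the need for the reverse step, and the return-value check reports a short read at the offset where it happens.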

So here is the potential bug. I am reading thousands upon thousands of these records without error. Then for some reason Perl decides to skip a single byte. So let's look at record three.

Let's say I am reading the 305th record of the third type, so I should have five floats, one int32 and two int16s. Let's also say that we are starting at offset 0x1000.

So what should we expect?

@ 0x1000 we should read 4 bytes for the first float

@ 0x1004 we should read 4 bytes for the second float

@ 0x1008 we should read 4 bytes for the third float

@ 0x100C we should read 4 bytes for the fourth float

@ 0x1010 we should read 4 bytes for the fifth float

@ 0x1014 we should read 4 bytes for the only int32

@ 0x1018 we should read 2 bytes for the first int16

@ 0x101A we should read 2 bytes for the second int16

However, this is not what happens, and I am losing my mind. Here is what goes down...

@ 0x1000 we read 4 bytes for the first float

@ 0x1004 we read 4 bytes for the second float

@ 0x1008 we read 4 bytes for the third float

@ 0x100C we read 4 bytes for the fourth float

Now my next (final) float should be stored in bytes 0x1010 through 0x1013. However, what happens is that byte 0x1010 is discarded/skipped, so the float is read from 0x1011 through 0x1014 instead! Everything from this point forward is shifted by one byte. As you can see from the code above, I only ever read an even number of bytes, 2 or 4. If I were off by two, I would believe that I had made a mistake somewhere and read an extra int16, but it is only a single byte! I have tried this script on multiple versions of these files, with the exact same behavior every time. It occurs at a different place in each file, but at a consistent location for each individual file.
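For reference, here is a small sketch that computes where each field of a third-type record starts, in hexadecimal, from the 0x1000 base used in this example, and that checks the handle position after each record with tell. The field names, position_ok, and the zero-filled in-memory data are all made up for illustration.

```perl
use strict;
use warnings;

# Field names (made up for this sketch) and sizes, in file order.
my @layout = (
    [ float1 => 4 ], [ float2 => 4 ], [ float3 => 4 ], [ float4 => 4 ],
    [ float5 => 4 ], [ int32  => 4 ], [ int16a => 2 ], [ int16b => 2 ],
);

# Walk the layout from the example base offset and record each field start.
my $offset = 0x1000;
my @lines;
for my $field (@layout) {
    my ($name, $size) = @$field;
    push @lines, sprintf('0x%04X %-6s %d bytes', $offset, $name, $size);
    $offset += $size;
}
print "$_\n" for @lines;
printf "next record starts at 0x%04X\n", $offset;

# Hypothetical check: report whether the handle sits where the layout says.
sub position_ok {
    my ($fh, $expected) = @_;
    return tell($fh) == $expected;
}

# Demo on 56 zero bytes standing in for two 28-byte records.
my $data = "\0" x 56;
open(my $fh, '<', \$data) or die "cannot open in-memory file: $!\n";
binmode($fh);
my @ok;
for my $n (1 .. 2) {
    read($fh, my $buf, 28);
    push @ok, position_ok($fh, 28 * $n) ? 1 : 0;
}
close($fh);
print "positions ok: @ok\n";
```

Run against the real file, a check like this after every record would pin down the first record at which the position on disk and the expected offset disagree.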

All the files I am working with are classified military files so I can't share. So I hope I was descriptive enough to point someone in the right direction.

I have verified this behavior many times and in many ways. Printing the records in the run-up to the bug, this is essentially what I get...

print: 1.0 2.0 3.0 4.0 5.0 25 1 0

print: 1.1 2.2 3.3 4.15 5.35 26 2 0

print: 1.2 2.4 3.6 4.25 5.53 25 2 0

print: 1.3 2.6 3.9 4.0 2.58e-044 -7923652397.....