I am trying to read data from a binary file. The file contains blocks, each made up of a sequence identifier followed by two sequences of unsigned integers. The format of the file is as follows.
The first 4 bytes are general information; then the data comes in blocks like this:
next 4 bytes: the sequence identifier
next 2 bytes: the length y of the sequence
next y * 2 bytes: the first sequence
next 2 bytes: a separator before the second sequence
next y * 2 bytes: the second sequence
next 2 bytes: a separator before the following block
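
As a minimal sketch of the layout above: each block occupies 4 + 2 + 2y + 2 + 2y + 2 = 10 + 4y bytes, so one block can be decoded from a byte string with unpack. The byte order is not stated, so little-endian ('V' and 'v') is an assumption here (use 'N' and 'n' for big-endian), and decode_block is a hypothetical helper name. There is no bounds checking.

```perl
use strict;
use warnings;

# Hypothetical helper: decode one block starting at byte offset $pos of $buf.
# Assumption: little-endian integers ('V' = 32-bit, 'v' = 16-bit unsigned).
sub decode_block {
    my ($buf, $pos) = @_;

    # 4-byte sequence identifier followed by the 2-byte length y.
    my ($id, $len) = unpack('V v', substr($buf, $pos, 6));
    my $seq_bytes  = 2 * $len;

    # First sequence starts right after the 6-byte block header.
    my @first  = unpack('v*', substr($buf, $pos + 6, $seq_bytes));

    # Second sequence starts after the first sequence and its 2-byte separator.
    my @second = unpack('v*', substr($buf, $pos + 6 + $seq_bytes + 2, $seq_bytes));

    # Offset of the following block: 10 fixed bytes plus 4 bytes per element.
    my $next = $pos + 10 + 4 * $len;
    return ($id, \@first, \@second, $next);
}
```

The first block starts at offset 4, after the general information; the returned offset is where the following block starts.
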
The number of these blocks varies and is not known when the file is opened. I wrote some code to get the sequences out of the binary file:

open(DATFILE, "<$datfilename") or die $!;
binmode(DATFILE);
read(DATFILE, $_, 4, 0);          # Read 4 bytes of the general information
foreach (0..110){
    read(DATFILE, $_, 4, 0);      # Read 4 bytes of the profile ID
    read(DATFILE, $_, 2, 0);      # Read 2 bytes of the sequence length
    &ReadData($profilelength);    # Read the first sequence
    read(DATFILE, $_, 2, 0);      # Read 2 bytes of the trailing zero
    &ReadData($profilelength);    # Read the second sequence
    read(DATFILE, $_, 2, 0);      # Read 2 bytes of the trailing zero
}

This particular file has 111 data blocks and this code works. The subroutine ReadData puts the data in an array. No problems here.
For the real thing I want to replace the foreach (0..110) with while (<DATFILE>) to keep reading until EOF, since I do not know the number of blocks. When I do this, the read behaviour changes: instead of starting at the expected byte 5, as it does with foreach, it starts reading at byte 15 with while. That is in the middle of the sequence, so the sequence length is wrong and the data that comes out is corrupt. Could any of the wise monks here kindly explain this while behaviour to me, and perhaps suggest the proper way to do it? Kind regards, Hans
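
A note on the likely cause, with a sketch of one way around it. <DATFILE> in the while condition is itself a read: each evaluation consumes bytes up to and including the next newline (0x0A) byte, or the rest of the file if there is none, before the read calls in the loop body run. In binary data that byte can appear anywhere, which would be consistent with reading suddenly starting inside a sequence. A common alternative is to test the return value of read, which is 0 at end of file and undef on error. The sketch below assumes little-endian integers ('V' and 'v'; use 'N' and 'n' for big-endian); read_exact and read_blocks are hypothetical names, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: read exactly $n bytes ($n > 0) from $fh.
# Returns undef at a clean end of file, dies on errors or short reads.
sub read_exact {
    my ($fh, $n) = @_;
    my $got = read($fh, my $buf, $n);
    die "read failed: $!" unless defined $got;
    return undef if $got == 0;
    die "truncated file: wanted $n bytes, got $got" if $got != $n;
    return $buf;
}

# Read every block until end of file; the block count is not needed.
# Assumption: little-endian integers ('V' = 32-bit, 'v' = 16-bit unsigned).
sub read_blocks {
    my ($fh) = @_;
    binmode($fh);
    defined read_exact($fh, 4) or die "missing 4-byte general information";

    my @blocks;
    # The loop ends when no further block header can be read (end of file).
    while (defined(my $head = read_exact($fh, 6))) {   # 4-byte ID + 2-byte length
        my ($id, $len) = unpack('V v', $head);

        # Both sequences plus both 2-byte separators: 4 * $len + 4 bytes.
        my $body = read_exact($fh, 4 * $len + 4);
        defined $body or die "truncated block $id";

        my @first  = unpack('v*', substr($body, 0, 2 * $len));
        my @second = unpack('v*', substr($body, 2 * $len + 2, 2 * $len));
        push @blocks, { id => $id, first => \@first, second => \@second };
    }
    return @blocks;
}
```

With a lexical filehandle opened as open(my $fh, '<', $datfilename) or die $!; the call would be my @blocks = read_blocks($fh); and each element holds the identifier and the two sequences of one block.
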


In reply to while behaviour on binary files by Anonymous Monk
