in reply to Re^3: Unexpected result using tell/seek within the __DATA__ file
in thread Unexpected result using tell/seek within the __DATA__ file

You took that out of context. There is no contradiction. You were the one who implied the program file had mixed newline types. I only got out the hex editor to rule out the possibility. All the newlines in the program file have 2 bytes as I stated already.

The offsets I'm referring to in the original post are the ones shown in the program output: 468, 480, 493 and 505 are the positions stored in the array that was built from "tell DATA". By taking the differences between successive numbers I could see that adding a newline in the DATA section changed the difference from 12 to 13: 480-468=12; 493-480=13.
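As a quick check of that arithmetic, here is a minimal sketch that takes the four offsets quoted above and prints the difference between each one and its predecessor. The variable names are made up for illustration; only the numbers come from the post.

```perl
use strict;
use warnings;

# Offsets reported by "tell DATA" in the original post's program output.
my @offsets = (468, 480, 493, 505);

# Difference between each offset and the one before it.
my @deltas = map { $offsets[$_] - $offsets[$_ - 1] } 1 .. $#offsets;

print "@deltas\n";    # prints "12 13 12"
```

The middle gap is the one that spans the extra newline in the DATA section.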

Adding a newline in the main code caused all offsets displayed in the program output to be shifted by 2.

Update: I just got the hex editor out again, all the dust has been brushed off now. The position of the next character after each of the __DATA__ lines is as follows:

from Hex edit     From program output in OP
0x1D6 => 470      # 470 (the same)
0x1E4 => 484      # 482 different, just like I said.
0x1F4 => 500      # 495
0x202 => 514      # 507
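The hex-editor positions and the program's positions can be lined up mechanically. The sketch below converts the hex values quoted above, then prints the gaps in each series and how far the two series have drifted apart at each __DATA__ line. The array names are illustrative only.

```perl
use strict;
use warnings;

# Positions read from the hex editor (hex) and from the program output (decimal).
my @hex_editor = map { hex($_) } qw(1D6 1E4 1F4 202);
my @reported   = (470, 482, 495, 507);

# Gaps between consecutive positions in each series.
my @hex_gaps = map { $hex_editor[$_] - $hex_editor[$_ - 1] } 1 .. $#hex_editor;
my @rep_gaps = map { $reported[$_]   - $reported[$_ - 1]   } 1 .. $#reported;

# How far the reported position lags the real one at each __DATA__ line.
my @drift = map { $hex_editor[$_] - $reported[$_] } 0 .. $#reported;

print "hex editor gaps: @hex_gaps\n";    # 14 16 14
print "reported gaps:   @rep_gaps\n";    # 12 13 12
print "drift:           @drift\n";       # 0 2 5 7
```

The drift grows by 2, 3 and 2, which is the number of line endings crossed in each gap.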

It confirms what I've been saying all along.

Re^5: Unexpected result using tell/seek within the __DATA__ file
by ikegami (Patriarch) on Mar 12, 2011 at 00:36 UTC

    So what are the 12 bytes vs the 13 bytes, using the hex editor?

    If you're using CRLF as you say, you would get 14 and 16.

    1   2   3  4  5   6   7   8   9   10  11  12  13 14
    "a" "b" CR LF "_" "_" "D" "A" "T" "A" "_" "_" CR LF

    "a" "b" CR LF CR LF "_" "_" "D" "A" "T" "A" "_" "_" CR LF
    1   2   3  4  5  6  7   8   9   10  11  12  13  14  15 16
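The counts in the diagram can be reproduced with a few lines of Perl. This is only a sketch: `span_length` is a hypothetical helper, not something from the thread, and it simply measures the text between two consecutive __DATA__ markers for a given line ending. For comparison it also prints the lengths with bare LF endings, which happen to match the 12 and 13 seen in the program output.

```perl
use strict;
use warnings;

# Byte length of "ab", optional blank lines, and the following __DATA__ line,
# using the given end-of-line sequence. (Hypothetical helper for illustration.)
sub span_length {
    my ($eol, $blank_lines) = @_;
    my $text = "ab" . $eol . ($eol x $blank_lines) . "__DATA__" . $eol;
    return length $text;
}

printf "CRLF: %d and %d\n", span_length("\r\n", 0), span_length("\r\n", 1);    # 14 and 16
printf "LF:   %d and %d\n", span_length("\n",   0), span_length("\n",   1);    # 12 and 13
```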

    None of PerlIO, clib and the OS know anything about __DATA__, so they don't report file positions any differently after the first __DATA__ is encountered.

    I didn't take anything out of context. You said you get a shift of 2 before DATA and a shift of 1 (13-12) after DATA. Yet you say the difference in the file is two characters (CR LF) in both cases. Those statements are contradictory.

      Have a second look at my previous post. I did get 14 and 16 with the hex editor.

      None of PerlIO, clib and the OS know anything about __DATA__, so they don't report file positions any differently after the first __DATA__ is encountered.

      Special Literals

      Text after __DATA__ may be read via the filehandle PACKNAME::DATA, where PACKNAME is the package that was current when the __DATA__ token was encountered. The filehandle is left open pointing to the contents after __DATA__. It is the program's responsibility to close DATA when it is done reading from it.

      Correct me if I'm wrong, but the way I understand it, Perl has to compile and preprocess the program file. It reads the file until EOF, __END__ or __DATA__. According to the section above, DATA is left open pointing to just after __DATA__. That means that when you do a "tell DATA" at the beginning of the script, as I did, there is nothing for tell() to calculate: it just has to report the file location. After reading from DATA, another "tell DATA" has to do the math.

      I just ran the script on Knoppix linux and this problem didn't show up. The seek went to the correct location and there were no partial lines printed.

      I'm sorry that I couldn't explain about the offsets better. I know what I meant and it makes sense to me. I thought it was an interesting problem that others would be interested in or I wouldn't have posted it.

      I've had a splitting headache all day and now that I've had some Tylenol and a snack I'm off to relax a little.

        I can confirm the bug (ActivePerl 5.12.1). It doesn't happen with PERLIO=:crlf on a linux machine. (5.12.2, default build config).

        Here's a good demonstration:

        #!/usr/bin/perl
        use strict;
        use warnings;

        my @data_positions = tell(DATA);
        while (<DATA>){
            if (/^__DATA__$/) {
                push @data_positions, tell(DATA);
            }
        }

        my @fh_positions;
        open(my $fh, '<', $0) or die;
        while (<$fh>){
            if (/^__DATA__$/) {
                push @fh_positions, tell($fh);
            }
        }

        print("@data_positions\n");
        print("@fh_positions\n");

        __DATA__
        ab
        __DATA__
        ab

        __DATA__
        ab
        __DATA__
        lotsa junk
        nothing

        389 401 414 426
        389 403 419 433