in reply to Re^5: Error binmode() on unopened filehandle
in thread Error binmode() on unopened filehandle

Once you go to BINMODE on a file handle, a record separator makes no sense.

I think I understand where you're coming from: when reading a binary file, it often makes more sense to use read instead of readline (aka <>), and I personally would probably use the former.

However, I also see several incorrect statements mixed into your nodes, such as: "Use of the DATA file handle is 'special'. Your initial premise that you could read binary data from the DATA file handle is wrong. That data will be in a character format." This is wrong; see my node here.

I guess bottom line: Don't use <> bracket when reading binary files!

DATA is just another filehandle, and readline is not that magical, it can be used to read any filehandle (whether DATA, a binary file, etc.), as long as you pay attention to $/. For example, you can set $/ to a reference to an integer, and then readline will read "records" from the file, very much like read does.
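A minimal sketch of that last point, assuming an in-memory filehandle and a made-up 8-byte sample string (neither is from the original thread). It reads the same bytes once with readline and a fixed record size, and once with read, and the two should produce the same chunks:

```perl
use strict;
use warnings;

# Hypothetical sample data: 8 bytes, deliberately containing "\n" (0x0A).
my $bytes = "\x00\x01\x0A\x03\xFF\xFE\x0A\x07";

# readline with $/ set to a reference to an integer: fixed-size records.
open my $fh, '<:raw', \$bytes or die "open: $!";
my @via_readline;
{
    local $/ = \4;            # each "record" is 4 bytes, newlines are not special
    @via_readline = <$fh>;    # list context returns all records
}
close $fh;

# The same data in 4-byte chunks using read().
open $fh, '<:raw', \$bytes or die "open: $!";
my @via_read;
while ( read( $fh, my $chunk, 4 ) ) {
    push @via_read, $chunk;
}
close $fh;

printf "readline: %d records, read: %d records\n",
    scalar @via_readline, scalar @via_read;
print "identical\n" if "@via_readline" eq "@via_read";
```

The localized $/ keeps the record-size setting from leaking into other readline calls.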

Update:

print "this is Windows machine and I don't see both CR and LF characters\n";
print "but I think that is due to Perl translation of line endings\n";

binmode turns off the CRLF to LF conversion, so if you're not seeing CRLF line endings (not sure how you determined that?) then that means the source file has only LF instead of CRLF line endings. Update 2: Hmm, see replies.
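To see the conversion in isolation on any platform, here is a small sketch. It assumes an explicit :crlf layer as a stand-in for the Windows text-mode default, and uses a temporary file from File::Temp; both are my choices for illustration, not something from the code under discussion:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Assumption: the explicit :crlf layer emulates the Windows text-mode default.
my ( $tmp, $path ) = tempfile( UNLINK => 1 );
close $tmp;

# Write two lines through the :crlf layer: each "\n" is stored as "\r\n".
open my $out, '>:crlf', $path or die "open: $!";
print {$out} "first\n", "second\n";
close $out or die "close: $!";

# Read back through :crlf: "\r\n" is translated to "\n" again.
open my $text, '<:crlf', $path or die "open: $!";
my $translated = do { local $/; <$text> };
close $text;

# Read back with binmode: no translation, the CRs are visible.
open my $bin, '<', $path or die "open: $!";
binmode $bin;
my $untranslated = do { local $/; <$bin> };
close $bin;

printf "text mode: %d bytes, binmode: %d bytes\n",
    length $translated, length $untranslated;
# prints: text mode: 13 bytes, binmode: 15 bytes
```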

Minor edits for clarity.

Re^7: Error binmode() on unopened filehandle
by Marshall (Canon) on May 03, 2020 at 21:17 UTC
    That is interesting.

    I also would use read() for reading a binary file.

    binmode turns off the CRLF to LF conversion, so if you're not seeing CRLF line endings (not sure how you determined that?) then that means the source file has only LF instead of CRLF line endings.
    The Perl source file was written on a Windows machine with CRLF line endings.
    n_bytes is 13, which is 2 short.
    I am a bit perplexed about that.
    This:

    my $data = <<EOF;
    first
    second
    EOF
    evidently deletes the <CR> characters.

    Update:

    use strict;
    use warnings;

    open (my $out, '>', "test_endings.txt") or die "$!";
    print $out "first\n";
    print $out "second\n";
    close $out;

    open (my $in, '<', "test_endings.txt") or die "$!";
    binmode $in;
    my $num_bytes = read ($in, my $buf, 20000);
    print "bytes read = $num_bytes\n"; ## prints 15 The <CR>'s are there in bin mode
      This:
      my $data = <<EOF;
      first
      second
      EOF
      evidently deletes the <CR> characters.

      Hmm, I'm quite surprised by that, and I'm still looking for the place where that's documented. Even trying to turn off the default :crlf layer on Windows doesn't seem to restore the CRLFs in $data. In addition, even on *NIX, eval "<<BAR\r\nx\r\ny\r\nBAR" causes the returned value to have only \n's, so it appears to be something to do with how heredocs are parsed. In fact, I've reported a bug.

        I've been away for a while and I am answering posts in LIFO order.
        At the end of the day, I do not recommend using a here-doc as shown to generate a binary byte sequence. There are other ways that "for sure" will work.

        Of course a binary file doesn't always have printable ASCII characters.

          my $data = <<EOF;
          ...
          EOF

      evidently deletes the <CR> characters.

      As I understand it, in this particular case the CRs (carriage returns) are never there (in $data) to begin with. A here-doc is just another way to compose a string, in this case with double-quote interpolation (but that has no bearing here). Each line ends in a single \n (newline) character.

      Writing such a line to a Windoze "text"-mode (i.e., non-binmode-ed) file causes CRs to be added. This can be seen with an "ordinary" string containing newlines that is written in "text" mode and then read back binmode-ed:

      c:\@Work\Perl\monks\Marshall>perl -wMstrict -e
      "use autodie;
       ;;
       use Data::Dump qw(dd);
       ;;
       my $s = qq{first\nsecond\n};
       dd 's:', $s;
       print 'length: ', length $s, qq{\n};
       ;;
       { open my $fh, '>', 'junque';
         print $fh $s;
         close $fh;
       }
       ;;
       { open my $fh, '<', 'junque';
         binmode $fh;
         my $t = do { local $/; <$fh>; };
         dd 't:', $t;
         print 'length: ', length $t, qq{\n};
         close $fh;
       }
       "
      ("s:", "first\nsecond\n")
      length: 13
      ("t:", "first\r\nsecond\r\n")
      length: 15


      Give a man a fish:  <%-{-{-{-<

        As I understand it, in this particular case the CRs (carriage returns) are never there (in $data) to begin with. A here-doc is just another way to compose a string, in this case with double-quote interpolation (but that has no bearing here). Each line ends in a single \n (newline) character.

        Not exactly right... The <CR>'s are there in the Windows source file, but the here-doc strips them out. I personally recommend using a binary read() instead of the <> operator for actual files. I guess we are talking here about how to embed binary data in a Perl source file?

        Note that on old Macs, the text line ending is <CR> instead of <CR><LF> or <LF>.