in reply to Re^3: Error binmode() on unopened filehandle
in thread Error binmode() on unopened filehandle

I disagree. What about this:

#!/usr/bin/perl
use strict;
use warnings;

my $data = <<EOF;
first
second
EOF

open my $fh, '<', \$data;
binmode $fh;
my $binary = <$fh>;
print "$binary\n";

Greetings,
-jo

$gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$

Re^5: Error binmode() on unopened filehandle
by Marshall (Canon) on May 03, 2020 at 15:41 UTC
    Ok, spent some more time fiddling with this. I now see what you mean (code attached).
    I have never read a binary file with the <> angle operator. That idea would never have occurred to me. I have always used read(), specifying the number of bytes to read, as shown below. This implies that although this "angle read" appeared to work in my .jpg example, there could be some CRLF sequence in the data that would cause this .jpg read to fail. I find that interesting to know. Good point.

    I guess bottom line: Don't use <> bracket when reading binary files!

    #!/usr/bin/perl
    use strict;
    use warnings;

    my $data = <<EOF;
    first
    second
    EOF

    print "data var in text mode - this works...\n";
    print "$data\n";
    print "----\n";

    open my $fh, '<', \$data;
    binmode $fh;
    my $num_bytes = read ($fh, my $buf, 20000);
    print "read () binary doesn't completely work..the normal way to read binary\n";
    print "this is Windows machine and I don't see both CR and LF characters\n";
    print "but I think that is due to Perl translation of line endings\n";
    print "bytes read = $num_bytes\n";
    print '',$buf;
    print "----\n";

    print "using angle operator for binary read doesn't work\n";
    print "I've never tried this before and I'm not sure why\n";
    print "this doesn't work - need explanation of the angle <>op.\n";
    close $fh;
    open $fh, '<', \$data or die "$!";
    binmode $fh;
    my $bdata = <$fh>;
    print '',$bdata;
    __END__
    data var in text mode - this works...
    first
    second

    ----
    read () binary doesn't completely work..the normal way to read binary
    this is Windows machine and I don't see both CR and LF characters
    but I think that is due to Perl translation of line endings
    bytes read = 13
    first
    second
    ----
    using angle operator for binary read doesn't work
    I've never tried this before and I'm not sure why
    this doesn't work - need explanation of the angle <>op.
    first
      Once you go to BINMODE on a file handle, a record separator makes no sense.

      I think I understand where you're coming from: when reading a binary file, it often makes more sense to use read instead of readline (aka <>), and I personally would probably use the former.

      However, I also see several incorrect statements mixed in your nodes, like "Use of the DATA file handle is "special". Your initial premise that you could read binary data from the DATA file handle is wrong. That data will be in a character format." - this is wrong, see my node here.

      I guess bottom line: Don't use <> bracket when reading binary files!

      DATA is just another filehandle, and readline is not that magical, it can be used to read any filehandle (whether DATA, a binary file, etc.), as long as you pay attention to $/. For example, you can set $/ to a reference to an integer, and then readline will read "records" from the file, very much like read does.
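      A minimal sketch of the record idea, assuming an in-memory filehandle and made-up sample data (the variable names and the record size of 4 are only for illustration). With $/ set to a reference to an integer, readline returns chunks of at most that many bytes and ignores any newline bytes in the data:

```perl
use strict;
use warnings;

# Hypothetical sample data: 10 bytes, including "\n" and "\r\n" bytes
# that must not be treated as line endings.
my $data = "ab\ncd\r\nefg";

open my $fh, '<', \$data or die "open failed: $!";
binmode $fh;

my @records;
{
    local $/ = \4;    # readline now returns records of up to 4 bytes
    while (my $rec = <$fh>) {
        push @records, $rec;
    }
}
close $fh;

# Show how many records were read and how long each one is.
printf "%d records, lengths: %s\n",
    scalar @records, join ',', map { length } @records;
```

      The last record is shorter than 4 bytes because the data runs out, which is the same thing read would report through its return value.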

      Update:

      print "this is Windows machine and I don't see both CR and LF characters\n";
      print "but I think that is due to Perl translation of line endings\n";

      binmode turns off the CRLF to LF conversion, so if you're not seeing CRLF line endings (not sure how you determined that?) then that means the source file has only LF instead of CRLF line endings. Update 2: Hmm, see replies.

      Minor edits for clarity.

        That is interesting.

        I also would use read() for reading a binary file.

        binmode turns off the CRLF to LF conversion, so if you're not seeing CRLF line endings (not sure how you determined that?) then that means the source file has only LF instead of CRLF line endings.
        The Perl source file was written on a Windows machine with CRLF line endings.
        $num_bytes is 13, which is 2 short.
        I am a bit perplexed about that.
        This:

        my $data = <<EOF;
        first
        second
        EOF
        evidently deletes the <CR> characters.

        Update:

        use strict;
        use warnings;

        open (my $out, '>', "test_endings.txt") or die "$!";
        print $out "first\n";
        print $out "second\n";
        close $out;

        open (my $in, '<', "test_endings.txt") or die "$!";
        binmode $in;
        my $num_bytes = read ($in, my $buf, 20000);
        print "bytes read = $num_bytes\n"; ## prints 15 The <CR>'s are there in bin mode
      I guess bottom line: Don't use <> bracket when reading binary files!

      I don't see a problem with this, as long as you use binmode and local $/

      Greetings,
      -jo


      Naw, you just need to set $/ appropriately. There is no real difference between

      binmode $fh;
      read($fh, my $buf, 20000);
      and
      binmode $fh;
      local $/ = \20000;
      my $buf = <$fh>;

      Of course, both are junk. Why would you only read the first 20,000 bytes? The following make more sense:

      binmode $fh;
      my $file = '';
      1 while read($fh, $file, 8*1024, length($file));
      or
      binmode $fh;
      local $/;
      my $file = <$fh>;
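      For anyone who wants to check that the two slurp variants really return the same bytes, here is a small sketch. It assumes an in-memory filehandle and made-up test data that is larger than one 8 KiB chunk; the variable names are only for illustration:

```perl
use strict;
use warnings;

# Hypothetical binary data: every byte value 0..255, repeated 100 times
# (25,600 bytes, so the read() loop needs several 8 KiB chunks).
my $data = join('', map { chr } 0 .. 255) x 100;

# Variant 1: append 8 KiB chunks with read() until it returns 0 (EOF).
open my $fh, '<', \$data or die "open failed: $!";
binmode $fh;
my $via_read = '';
1 while read($fh, $via_read, 8 * 1024, length($via_read));
close $fh;

# Variant 2: undefine $/ so that readline returns the whole file at once.
open $fh, '<', \$data or die "open failed: $!";
binmode $fh;
my $via_readline = do { local $/; <$fh> };
close $fh;

print "same content: ", ($via_read eq $via_readline ? "yes" : "no"), "\n";
print "length: ", length($via_read), "\n";
```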
        Of course, both are junk. Why would you only read the first 20,000 bytes?

        There are a lot of scenarios where you might want to read the first part of a file without reading the whole file. I think there are some Unix file commands that read the first 1-2K of a file to determine if the file is text or binary? Perhaps I want to concatenate some big .WAV files together. There is some header info at the beginning of these files that needs to be interpreted. In the OP's question, this is a single .jpg and there is no reason to read the file in "hunks" because the image has to be processed as a single unit. However, other scenarios do exist.
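        A rough sketch of the header case, assuming a made-up stand-in for a .WAV file that is built in memory (a 12-byte RIFF header followed by dummy payload; the names and sizes are hypothetical). Only the header is read and decoded, and the rest of the data is left alone:

```perl
use strict;
use warnings;

# Hypothetical stand-in for a .WAV file: "RIFF", a 32-bit little-endian
# size field, "WAVE", then 1000 bytes of dummy payload.
my $payload = "\0" x 1000;
my $file    = pack('A4 V A4', 'RIFF', 4 + length($payload), 'WAVE') . $payload;

open my $fh, '<', \$file or die "open failed: $!";
binmode $fh;

# Read just the 12 header bytes; the payload is not read.
my $got = read($fh, my $header, 12);
die "read failed: $!" unless defined $got;
die "short header"    unless $got == 12;
close $fh;

# Decode the three header fields.
my ($magic, $size, $format) = unpack('A4 V A4', $header);
print "magic=$magic size=$size format=$format\n";
```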

        I do commend you for the choice of 8*1024 as buf size. That is a very good number with most file systems. Certain byte boundaries are important for the file system to work efficiently.

Re^5: Error binmode() on unopened filehandle
by Marshall (Canon) on May 03, 2020 at 14:39 UTC
    I don't understand the point that you are trying to make. You open a file handle to a Perl var. That's fine. You set binmode before you read from that file handle and that's fine too.

    Find some .jpg file you have somewhere and try the code that I posted. Use of the DATA file handle is "special".

    Your initial premise that you could read binary data from the DATA file handle is wrong. That data will be in a character format.

    Update: Here is a Perl program that reads and prints itself. DATA is an already read and opened file handle.

    use warnings;
    use strict;

    print "testing seek of DATA handle\n";
    print "this will print this program\n";
    seek (DATA,0,0);
    my $text = do{ local $/ = undef; <DATA>; };
    print $text;
    __DATA__
    asdfasdf
    asdfasdf

      My last example was without the special DATA file handle and shows that my $binary = <$fh> will read the data from $fh up to and including the first appearance of $/. Using binmode on a file handle does not change this behaviour.
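      A small sketch of that behaviour, assuming a made-up payload that contains a 0x0A byte early on (the bytes and variable names are only for illustration). The first read uses the default $/ and stops after the first newline byte even though binmode is set; the second undefines $/ and gets everything:

```perl
use strict;
use warnings;

# Hypothetical "binary" payload that happens to contain newline bytes (0x0A).
my $payload = "\x89PNG\x0D\x0A\x1A\x0A" . "rest of the file";

# Default $/ is "\n": readline stops after the first 0x0A byte.
open my $fh, '<', \$payload or die "open failed: $!";
binmode $fh;
my $first = <$fh>;
close $fh;

# $/ undefined: readline reads to the end of the file.
open $fh, '<', \$payload or die "open failed: $!";
binmode $fh;
my $whole = do { local $/; <$fh> };
close $fh;

printf "with default \$/: %d bytes\n", length $first;
printf "with undef \$/:   %d of %d bytes\n", length $whole, length $payload;
```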

      You may check if:

      • your data really does not contain a record separator
      • the generated base64 data, when decoded, resembles the original file

      EDIT: Here is another:

      #!/usr/bin/perl
      use strict;
      use warnings;

      use constant RANDFILE => "rand.dat";

      system "dd if=/dev/urandom of=" . RANDFILE . " bs=1k count=64";
      open my $fh, '<', RANDFILE or die;
      binmode $fh;
      my $binary = <$fh>;
      print "got: ", length($binary), "\n";

      Greetings,
      -jo


      Your initial premise that you could read binary data from the DATA file handle is wrong.

      Works for me.

      use feature qw( say );

      binmode DATA if $ARGV[0];
      while (<DATA>) {
          say sprintf "%v02X", $_;
      }
      __DATA__
      abc
      def

      >perl a.pl 0
      61.62.63.0A
      64.65.66.0A

      >perl a.pl 1
      61.62.63.0D.0A
      64.65.66.0D.0A
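      The output above comes from a Windows machine, where the :crlf layer is on by default. As a sketch that should behave the same on other platforms, the layers can be named explicitly on an in-memory filehandle. This assumes made-up content with CRLF line endings and that the :crlf layer can be stacked on an in-memory handle; the names are only for illustration:

```perl
use strict;
use warnings;

# Hypothetical file content with Windows (CRLF) line endings.
my $content = "abc\r\ndef\r\n";

my %seen;
for my $layer (':crlf', ':raw') {
    open my $fh, "<$layer", \$content or die "open failed: $!";
    while (my $line = <$fh>) {
        # %vX formats each character of the string as hex, joined by dots.
        push @{ $seen{$layer} }, sprintf "%v02X", $line;
    }
    close $fh;
}

for my $layer (':crlf', ':raw') {
    print "$layer\n";
    print "  $_\n" for @{ $seen{$layer} };
}
```

      With :raw the 0D bytes stay in the data, which is what binmode gives you on the DATA handle in the run with argument 1.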