in reply to Re^5: Error binmode() on unopened filehandle
in thread Error binmode() on unopened filehandle

Naw, you just need to set $/ appropriately. There is no real difference between

binmode $fh; read($fh, my $buf, 20000);
and
binmode $fh; local $/ = \20000; my $buf = <$fh>;

Of course, both are junk. Why would you only read the first 20,000 bytes? Either of the following makes more sense:

binmode $fh; my $file = ''; 1 while read($fh, $file, 8*1024, length($file));
or
binmode $fh; local $/; my $file = <$fh>;
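
For completeness, here's what the slurp variant looks like as a minimal runnable sketch ("image.jpg" is just a placeholder filename):

# Hypothetical filename; substitute your own.
open my $fh, '<', 'image.jpg' or die "Can't open image.jpg: $!";
binmode $fh;
my $file = do { local $/; <$fh> };  # localize $/ so slurp mode doesn't leak
close $fh;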

Re^7: Error binmode() on unopened filehandle
by Marshall (Canon) on May 07, 2020 at 00:09 UTC
    Of course, both are junk. Why would you only read the first 20,000 bytes?

    There are a lot of scenarios where you might want to read the first part of a file without reading the whole thing. The Unix file command, for instance, reads only the first few KB of a file to determine whether it is text or binary. Or perhaps I want to concatenate some big .WAV files together: there is header info at the beginning of those files that needs to be interpreted before the audio data. In the OP's question this is a single .jpg, and there is no reason to read the file in chunks because the image has to be processed as a single unit. But other scenarios do exist.
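
    As an illustration, a minimal sketch of that kind of header peek; it assumes the canonical 44-byte PCM WAV layout (real files may carry extra chunks before "data"), and 'input.wav' is a placeholder:

    open my $fh, '<', 'input.wav' or die "Can't open input.wav: $!";
    binmode $fh;
    read($fh, my $hdr, 44) == 44 or die "Short read on header: $!";
    my ($riff, undef, $wave, $fmt, undef,
        undef, $channels, $rate, undef, undef, $bits)
        = unpack 'A4 V A4 A4 V v v V V v v', $hdr;
    die "Not a RIFF/WAVE file" unless $riff eq 'RIFF' and $wave eq 'WAVE';
    printf "%d channel(s), %d Hz, %d bits/sample\n", $channels, $rate, $bits;
    close $fh;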

    I do commend you for the choice of 8*1024 as the buffer size. That is a very good number with most file systems; certain byte boundaries matter for the file system to work efficiently.

      Re "There are a lot of scenarios", Maybe, but the discussion at hand is about reading the entire file.

      I used 8*1024 because read reads in 8 KiB chunks anyway.

      $ perl -e'print "x" x 100_000' \
          | strace perl -e'read(\*STDIN, my $buf, 100_000)' 2>&1 \
          | grep -P 'read\(0,'
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 8192
      read(0, "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"..., 8192) = 1696

      But the parameter refers to the number of characters to return, which can differ from the number of bytes read if an :encoding layer is used. So really, the number I picked is nothing to praise. If you want efficiency, it's probably best to use sysread with a very large number and decode afterwards.
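
      Something along these lines; a rough sketch only, assuming a UTF-8 file and a made-up filename:

      use Encode qw( decode );

      open my $fh, '<:raw', 'input.txt' or die "Can't open input.txt: $!";
      my $bytes = '';
      1 while sysread($fh, $bytes, 1024*1024, length($bytes));  # 1 MiB chunks
      close $fh;
      my $text = decode('UTF-8', $bytes);  # decode once, after the fast raw reads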

        Thanks for your interesting benchmark. That surprised me.

        The reason why 8K is "good": the smallest unit of data that can be written to the disk is called a sector. For a bunch of historical and practical reasons, the most common sector size seen today is 512 bytes. There is no need for the file system to keep track of such a small unit, so the file system keeps track of blocks of sectors. An extremely common value for this smallest file-system data unit is 8 KB, i.e. 16 sectors. A combination of df or du commands can show this on a Unix system; sorry, I don't have a Unix system handy right now to post an example. If you write a file with one byte in it, it will still take 8K of space on the disk.

        It is more efficient to just start out with a buffer size that makes the "file system happy" (an increment of 8K). Bigger buffers typically help, but there are limits; I suspect there is not much to be gained once you are past 4*8192 bytes. Yes, sysread would have lower overhead, but the OP's situation doesn't sound like any kind of performance issue.
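
        One way to see that allocation overhead from Perl itself, as a rough sketch ('tiny.txt' is a hypothetical one-byte file): stat's 12th field counts 512-byte blocks actually allocated (on most systems), and the 11th is the preferred I/O block size.

        open my $fh, '>', 'tiny.txt' or die "Can't create tiny.txt: $!";
        print $fh 'x';      # write a single byte
        close $fh;
        my ($blksize, $blocks) = (stat 'tiny.txt')[11, 12];
        printf "preferred blksize: %d; disk space used: %d bytes\n",
            $blksize, $blocks * 512;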