
I know that, but I've already explained why it's irrelevant. There's no correspondence between the parameter passed to read and the amount that needs to be read from disk, so saying that reading 8192 bytes from disk at a time is a good idea doesn't make requesting 8192 characters from read a good idea.

Well, the title of this node explicitly includes "binmode()". Given that, my assumption that we are talking about binary data is not completely unfounded!

I was indeed quite surprised by your code at Re^8: Error binmode() on unopened filehandle.
I knew that your conclusion was incorrect. "I used 8*1024 because read reads in 8 KiB chunks anyway." But at the time, I just wanted to cover the basics for other readers. I suspect part of the problem is 100_000 vs 1000000. There are also typically limits to the size of the STDIN pipe.

Here is code that you can run on an actual disk file. Your test code is not representative of a real-world example. read() can indeed read more than 8192 bytes!

use strict;
use warnings;

# run on Windows 10 Home Edition

my $file ='COVID19-Death02Apr.jpg'; # any big file

open my $fh, '<', $file or die "$!";
binmode $fh;

my $data;
my $num_read = read ($fh, $data, 3*8192);

print $num_read;  # 24576  just fine!

__END__
open my $fh, '<', $file or die "$!";
binmode $fh;

The following open means exactly the same as the two lines above:
open my $fh, '<:raw', $file or die "$!";
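If you do not have a big file handy, the same check can be sketched against a temporary file. This is only an illustration under my own assumptions: the 100_000-byte test data and the variable names are made up for the example, and it compares the bytes returned by the two open forms rather than anything about their internals.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical test data: 100_000 bytes covering every byte value.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} pack('C*', map { $_ % 256 } 0 .. 99_999);
close $tmp_fh or die "close failed: $!";

# Form 1: plain open followed by binmode.
open my $fh1, '<', $tmp_name or die "Cannot open $tmp_name: $!";
binmode $fh1;
my $got1 = read($fh1, my $buf1, 3 * 8192);
close $fh1;

# Form 2: the :raw layer given directly to open.
open my $fh2, '<:raw', $tmp_name or die "Cannot open $tmp_name: $!";
my $got2 = read($fh2, my $buf2, 3 * 8192);
close $fh2;

printf "binmode: %d bytes, :raw: %d bytes, same data: %s\n",
    $got1, $got2, ($buf1 eq $buf2 ? 'yes' : 'no');
```

Both requests ask for 3*8192 bytes in a single call, so either form should hand back more than 8192 bytes from a regular file of this size.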

I have not benchmarked read() with the :raw layer against sysread(). I believe that the more direct sysread() will be faster, but by how much I do not know (and it depends on whether we measure CPU time or elapsed time). read() adds an additional level of buffering even in :raw mode (or so I suspect). However, most UNIX versions also copy data to a system area before queuing the disk write for the hardware, i.e. the absolute memory pointer that the disk subsystem gets will not be in user memory space, so an extra copy may not matter much.

The elapsed time of a file copy with minimal processing will be dominated by the disk system's ability to produce the "next blocks" and write them. Mechanical motion is involved in this, so some extra CPU time may or may not matter much, depending on what is done and how.
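For anyone who wants to measure this rather than guess, here is a rough sketch of how the comparison could be set up with the core Benchmark module. Everything in it is an assumption of mine: the 8 MiB temporary file, the 3*8192 chunk size and the sub names are only for illustration. The file will almost certainly be served from the OS cache, so this compares per-call overhead and says little about real disk behaviour. As far as I know, cmpthese computes its rates from CPU time, not elapsed time.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

my $chunk = 3 * 8192;          # request size per call (assumption)
my $size  = 8 * 1024 * 1024;   # 8 MiB of test data (assumption)

# Build a temporary binary file to read back.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "\0" x $size;
close $tmp_fh or die "close failed: $!";

# Read the whole file through PerlIO's buffered read().
sub total_via_read {
    open my $in, '<:raw', $tmp_name or die "Cannot open $tmp_name: $!";
    my $total = 0;
    while (my $n = read($in, my $buf, $chunk)) { $total += $n }
    close $in;
    return $total;
}

# Read the whole file with sysread(), which bypasses PerlIO buffering.
sub total_via_sysread {
    open my $in, '<:raw', $tmp_name or die "Cannot open $tmp_name: $!";
    my $total = 0;
    while (my $n = sysread($in, my $buf, $chunk)) { $total += $n }
    close $in;
    return $total;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    read    => \&total_via_read,
    sysread => \&total_via_sysread,
});
```

I have not run this on enough systems to quote numbers, so treat any result as specific to your machine, Perl build and file system.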

In reply to Re^11: Error binmode() on unopened filehandle by Marshall
in thread Error binmode() on unopened filehandle by RedJeep
