ikegami has asked for the wisdom of the Perl Monks concerning the following question:

I've always wondered whether read() and sysread() can return less than the requested number of bytes, even in the absence of an error and without reaching the end of the file. perlfunc implies that they can, but I've seen code which assumes they don't.

Re: Can read() return less than LENGTH bytes?
by Juerd (Abbot) on Aug 10, 2004 at 21:07 UTC

    There's no promise they cannot, so: yes. sysread calls read(2), so consult your platform's manual for details. Mine says:

    On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. In this case it is left unspecified whether the file position (if any) changes.
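    Given that wording, code that really needs LENGTH bytes has to loop. Below is a minimal sketch of such a loop, assuming a Unix-like system; read_exactly is a hypothetical helper name (not part of Perl), and for brevity it treats every sysread failure as fatal.

```perl
use strict;
use warnings;

# Hypothetical helper: keep calling sysread until $want bytes have arrived,
# end of file is reached, or an error occurs.
sub read_exactly {
    my ($fh, $want) = @_;
    my $buf = '';
    while (length($buf) < $want) {
        # The 4th argument is the offset, so new data is appended to $buf.
        my $got = sysread($fh, $buf, $want - length($buf), length($buf));
        die "sysread failed: $!" unless defined $got;   # error
        last if $got == 0;                               # end of file
    }
    return $buf;
}

# Demo: a pipe holding 5 bytes, with the writer closed so EOF follows.
pipe(my $r, my $w) or die "pipe: $!";
defined(syswrite($w, "hello")) or die "syswrite: $!";
close $w;

my $data = read_exactly($r, 1000);
print "asked for 1000, got ", length($data), " bytes\n";
```

    The caller still has to compare length($data) with the requested size, because end of file can cut the result short.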

    Juerd # { site => 'juerd.nl', plp_site => 'plp.juerd.nl', do_not_use => 'spamtrap' }

Re: Can read() return less than LENGTH bytes?
by dave_the_m (Monsignor) on Aug 10, 2004 at 21:05 UTC
    Yes, for example on a filehandle that's been set to non-blocking.

    Dave.

Re: Can read() return less than LENGTH bytes?
by fergal (Chaplain) on Aug 10, 2004 at 21:18 UTC
    If the file is set to be non-blocking, then when you read from it, it gives you as much as it can (up to LENGTH) but returns control to the program immediately. This allows you to communicate on several network sockets at once, for example. Without it you could get stuck waiting for LENGTH bytes to arrive.
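    A small sketch of that behaviour, assuming a Unix-like system (where a pipe can be made non-blocking) and using a pipe in place of a network socket:

```perl
use strict;
use warnings;
use IO::Handle;   # provides the blocking() method

pipe(my $r, my $w) or die "pipe: $!";
$r->blocking(0);  # put the read end into non-blocking mode

defined(syswrite($w, "hello")) or die "syswrite: $!";

# Only 5 bytes are waiting, so sysread hands them over and returns at once.
my $n = sysread($r, my $buf, 1000);
print "first read: $n bytes\n";

# Nothing is left. Instead of blocking, sysread fails with EAGAIN.
my $again = sysread($r, $buf, 1000);
my $would_block = !defined($again) && $!{EAGAIN};
print "second read would block\n" if $would_block;
```

    In a real program the second case is where you would go back to select() or another event loop instead of retrying immediately.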

    Also, on some Unixes, if a signal arrives then it can interrupt the system call and cause it to return prematurely (with $! set to EINTR, which means that nothing actually went wrong and you should try the call again). I tried to construct an example for this but it didn't work, possibly because perl handles signals in a reasonably robust manner and restarts the read for you automatically.
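    For the signal case, POSIX read(2) reports an interruption that happens before any data has been transferred as EINTR. A minimal sketch of a retry wrapper follows; sysread_retry is a hypothetical name, and the demo only exercises the normal path on a pipe, since it does not try to deliver a signal.

```perl
use strict;
use warnings;

# Hypothetical wrapper: repeat sysread for as long as it fails with EINTR.
sub sysread_retry {
    my ($fh, $len) = @_;
    while (1) {
        my $got = sysread($fh, my $buf, $len);
        return $buf if defined $got;   # data, or '' at end of file
        next if $!{EINTR};             # interrupted by a signal: try again
        die "sysread failed: $!";      # any other error is treated as fatal
    }
}

# Demo on a pipe with 3 bytes waiting.
pipe(my $r, my $w) or die "pipe: $!";
defined(syswrite($w, "abc")) or die "syswrite: $!";
close $w;

my $chunk = sysread_retry($r, 1000);
print "got ", length($chunk), " bytes\n";
```

    Note that the wrapper can still return fewer than $len bytes; it only hides the interrupted-call case.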

    That said, it seems that you don't need non-blocking IO or signals to get an example using sysread:

        sysread(STDIN, $buffer, 1000);
        print "read ".length($buffer). " chars\n$buffer\n";

    When I type something in and hit enter, the sysread returns immediately with far less than 1000 chars.
Re: Can read() return less than LENGTH bytes?
by ikegami (Patriarch) on Aug 10, 2004 at 21:19 UTC
    I have a follow-up question! Will the following snippets always give a line that ends with a newline and a record that is 200 bytes long, if the end of the file isn't reached and there are no errors/signals?
        {
            local $/ = "\n";
            $line = <FILE>;
        }

        {
            local $/ = \200;
            $record = <FILE>;
        }

        > cat > test.pl
        $/ = \10;
        $record = <STDIN>;
        print("[", length($record), "]\n");

        > perl test.pl
        1234
        67890123
        [10]
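    The same record-size experiment can be repeated without a terminal. The sketch below feeds the 14 bytes from the transcript through an in-memory filehandle (my substitution for STDIN) and collects the length of every record. It only shows what happens at end of file and does not settle the question for pipes or terminals.

```perl
use strict;
use warnings;

# The same 14 bytes typed in the transcript, held in an in-memory file.
my $input = "1234\n67890123\n";
open(my $fh, '<', \$input) or die "open: $!";

my @lengths;
{
    local $/ = \10;   # fixed-size records of 10 bytes
    while (defined(my $record = <$fh>)) {
        push @lengths, length $record;
    }
}
print "record lengths: @lengths\n";
```

    The first record is the full 10 bytes, as in the transcript; the second holds only the 4 bytes left before end of file.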
Re: Can read() return less than LENGTH bytes?
by GreyGlass (Sexton) on Aug 11, 2004 at 05:01 UTC
    Typically read(2) (i.e. sysread) will block until at least some data is available, if the descriptor is not set to non-blocking. Once the first byte is available, as many bytes as are available (up to the requested length) will be returned.

    In particular, on many systems socket buffers are fairly small (32K is typical). On such systems a sysread from a socket will never return more than 32K bytes, regardless of the size of the corresponding write to the socket.

    On my XP box the behaviour seems to be mixed: a sysread of 8MB (with a corresponding write of the same size on the client socket) returns after reading only 1K if the read is already pending when the write is issued, but brings in the full 8MB if the write is already pending when the sysread is issued.

    AFAIK, read (the buffered Perl function, as opposed to sysread) will wait for the specified size or end of file, unless the underlying handle is set to non-blocking.
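    One way to check that claim about buffered read is the sketch below, assuming a Unix-like system with fork. A child process delivers 6 bytes in two separate writes and the parent asks for all 6 with a single read call. The 0.2 second pause and the data are arbitrary choices for the demo.

```perl
use strict;
use warnings;
use POSIX ();

pipe(my $r, my $w) or die "pipe: $!";
my $pid = fork();
die "fork: $!" unless defined $pid;

if ($pid == 0) {
    # Child: deliver 6 bytes in two separate writes, a moment apart.
    close $r;
    syswrite($w, "abc");
    select(undef, undef, undef, 0.2);   # sleep for 0.2 seconds
    syswrite($w, "def");
    POSIX::_exit(0);                     # leave without running END blocks
}

# Parent: a single buffered read() asking for all 6 bytes.
close $w;
my $n = read($r, my $buf, 6);
waitpid($pid, 0);
print "read() returned $n bytes: $buf\n";
```

    Replacing read with sysread in the parent would be expected to return as soon as the first 3 bytes arrive, which is the difference the thread is about.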