http://qs1969.pair.com?node_id=1173814

vsespb has asked for the wisdom of the Perl Monks concerning the following question:

Often I need to read from and write to pipes without Perl's buffered IO (because the same pipes are used in select()). AFAIK this should be implemented via sysread/syswrite, and sysread/syswrite are documented to sometimes return less data than requested and to sometimes fail with EINTR (let's talk about blocking pipes for now). So every time I end up with the following wrappers:
    # args: ($file, $buffer, $length)
    # returns data in $buffer and the number of bytes read (0 on immediate EOF),
    # or undef on error when nothing was read
    sub sysreadfull {
        my ($file, $len) = ($_[0], $_[2]);
        my $n = 0;
        while ($len - $n) {
            my $i = sysread($file, $_[1], $len - $n, $n);
            if (defined($i)) {
                if ($i == 0) {
                    return $n;          # EOF: report what we have so far
                } else {
                    $n += $i;
                }
            } elsif ($!{EINTR}) {
                redo;                   # interrupted by a signal: retry
            } else {
                return $n ? $n : undef; # real error
            }
        }
        return $n;
    }

    # args: ($file, $buffer)
    # returns number of bytes actually written,
    # or undef on error when nothing was written
    sub syswritefull {
        my ($file, $len) = ($_[0], length($_[1]));
        my $n = 0;
        while ($len - $n) {
            my $i = syswrite($file, $_[1], $len - $n, $n);
            if (defined($i)) {
                $n += $i;
            } elsif ($!{EINTR}) {
                redo;                   # interrupted by a signal: retry
            } else {
                return $n ? $n : undef; # real error
            }
        }
        return $n;
    }
The question is: this should be a very common problem for everyone who deals with IPC or network programming, so why has nobody written a CPAN module with such wrappers? Maybe because there is an easier way? Maybe ":raw" handles? I saw similar code only here (but for non-blocking handles), and it seems to have a bug.

Re: sysread/syswrite wrappers
by BrowserUk (Patriarch) on Oct 12, 2016 at 13:32 UTC
    why has nobody written a CPAN module with such wrappers?

    My take is that sysread is usually only used for non-blocking reads, and then usually in code dealing with multiple input streams.

    In this usual case, you definitely don't want to block waiting for the rest of the expected input from any given stream, because it might be a long time coming; or indeed, never come.

    Rather you want the read to return ASAP -- partial or not -- so that you can utilise any time that your wrapper would spend 'blocking' on one stream, checking other streams for input.

    You say you are using sysread because you are using select; and presumably you are using select because you want to be doing other things whilst waiting for data to become available. What are those other things?

    And what is it that makes those things important enough to do while waiting for data to start arriving, yet not so important that you can put them off indefinitely if the pipe gives you only part of what you are expecting and never completes?
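The "return ASAP, partial or not" style described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the helper names `drain_ready` and `poll_once` are hypothetical, and newline-terminated messages are an assumption made to keep the example short. Each ready handle is read once, partial input is kept in a per-handle buffer, and only complete messages are handed to the callback.

```perl
use strict;
use warnings;
use IO::Select;

my %pending;    # partial input per handle, keyed by fileno

# Read whatever is available on $fh (one sysread, so no waiting for more)
# and return an array ref of complete newline-terminated messages,
# possibly empty. Returns undef on EOF or on a real error.
sub drain_ready {
    my ($fh) = @_;
    my $n = sysread($fh, my $chunk, 4096);
    if (!defined $n) {
        return [] if $!{EINTR};    # interrupted: try again on the next pass
        return undef;              # real error
    }
    return undef if $n == 0;       # EOF
    my $buf = \$pending{ fileno $fh };
    $$buf = '' unless defined $$buf;
    $$buf .= $chunk;
    my @messages;
    push @messages, $1 while $$buf =~ s/\A([^\n]*)\n//;
    return \@messages;
}

# One pass of the event loop: service every handle that is ready,
# never waiting on any single one of them.
sub poll_once {
    my ($sel, $timeout, $on_message) = @_;
    for my $fh ($sel->can_read($timeout)) {
        my $msgs = drain_ready($fh);
        if (!defined $msgs) {
            delete $pending{ fileno $fh };
            $sel->remove($fh);
            close $fh;
            next;
        }
        $on_message->($fh, $_) for @$msgs;
    }
}
```

A caller would build an `IO::Select` object over the read ends of its pipes and call `poll_once` in a loop, doing its other work between passes.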


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.
      OK, your point is very similar to the answer above.
      Probably my case is limited to IPC communication then. It's pipes, localhost only.
      I don't see a reason to do non-blocking reads then. If the sending process writes the whole message at once, say with
      syswritefull($fh, sprintf("%08d", length($line))) && syswritefull($fh, $line)
      it will be read by the receiving process very quickly (unless the whole system is unresponsive or swapping), and if the sending process crashes or dies, there will be an EOF. Blocking pipes and reading the whole message at once looks OK to me.
      With blocking pipes you can still select() between them to determine which process sent the next message to read.
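The receiving side of that framing (an 8-digit decimal length followed by the payload) could look roughly like this. It is a minimal sketch for blocking handles; `read_exactly` and `read_message` are hypothetical names, not functions from the post.

```perl
use strict;
use warnings;

# Read exactly $len bytes from $fh, retrying on short reads and EINTR.
# Returns the data, or undef on EOF or error before $len bytes arrived.
sub read_exactly {
    my ($fh, $len) = @_;
    my $data = '';
    while (length($data) < $len) {
        my $n = sysread($fh, $data, $len - length($data), length($data));
        next if !defined $n && $!{EINTR};    # interrupted: retry
        return undef unless $n;              # undef (error) or 0 (EOF)
    }
    return $data;
}

# Read one message framed as sprintf("%08d", length($payload)) . $payload.
# Returns the payload, or undef on EOF, error, or a malformed header.
sub read_message {
    my ($fh) = @_;
    my $header = read_exactly($fh, 8);
    return undef unless defined $header && $header =~ /\A\d{8}\z/;
    return read_exactly($fh, $header + 0);
}
```

Because both reads block until complete, this only behaves well under the assumption stated in the post: the sender writes header and payload back to back and does no slow work in between.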
        Blocking pipes and reading the whole message at once looks OK to me.

        Then why sysread/syswrite? You'll say because you need select; but the only reason for using sysread/syswrite with select is because they don't wait for complete input.

        If you're confident that every read will be a complete message, just use readline & print and have done with it.

        But, think on this, pipes are just buffers, usually 4K at each end; and data is not passed to the reading process until a full 4K is available. (And setting line buffering won't change that.)

        This is easily demonstrated. The following code writes 122 byte lines every tenth of a second, but you will see no output for 3.35 seconds because that's how long it takes to fill the 4k buffer. It then produces batches of lines every 3.35 seconds until the writer closes the pipe:

        perl -E"select('','','',0.1),say 'test'x30 for 1 ..200" | perl -ple1

        And if your messages are not some exact multiple of 4K, the last line of every 4k block will be a partial message, and your wrapper will therefore block until the next 4k block has been filled and passed through, before that message will be completed.
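Whether the batching originates in the pipe itself or in the writer's output buffering can be checked directly. The following is a minimal sketch, assuming a POSIX-like system where select() works on pipes (it does not on Windows); `readable_now` is a hypothetical helper. It writes a short message with a buffered print and then with syswrite, and checks after each step whether the read end already has data.

```perl
use strict;
use warnings;
use IO::Handle;

# Returns true if $fh has data ready to read right now (zero timeout).
sub readable_now {
    my ($fh) = @_;
    my $rin = '';
    vec($rin, fileno($fh), 1) = 1;
    my $nfound = select(my $rout = $rin, undef, undef, 0);
    return $nfound > 0;
}

pipe(my $r, my $w) or die "pipe: $!";

# Buffered write: the bytes stay in the writer's PerlIO buffer for now.
print {$w} "buffered\n";
my $before_flush = readable_now($r);

# After an explicit flush the same bytes are in the pipe.
$w->flush;
my $after_flush = readable_now($r);
sysread($r, my $got1, 4096);

# Unbuffered write: syswrite hands the bytes to the kernel at once.
syswrite($w, "direct\n");
my $after_syswrite = readable_now($r);
sysread($r, my $got2, 4096);

printf "before flush: %d, after flush: %d, after syswrite: %d\n",
    $before_flush ? 1 : 0, $after_flush ? 1 : 0, $after_syswrite ? 1 : 0;
```

If the short message is readable as soon as it has been flushed or written with syswrite, a writer that uses syswrite (or autoflush) would not make the reader wait for a full block.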


Re: sysread/syswrite wrappers
by choroba (Cardinal) on Oct 12, 2016 at 11:57 UTC
    Reminds me of my struggles with A Game Using TCP Sockets. I haven't seen the code for several years, so it might be only tangentially relevant. I definitely don't remember using EINTR.

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      Yep, your post links to my post written in 2010. Back then it was a surprise to me that I had to use syswrite and could not use, say, eof(), but now it's no surprise: I know how to work around this, and everything works in production and is well tested. But now I'm wondering why nobody else is inventing wrappers for sysread, and whether there is a simpler way.
Re: sysread/syswrite wrappers (select)
by tye (Sage) on Oct 12, 2016 at 13:08 UTC

    But doing this thwarts the usual reasons for using select on said pipes. Your second call to sysread/syswrite might hang. So, no, I have never written nor used such a wrapper.
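One possible middle ground, offered only as a sketch and not as something proposed in the thread, is to keep a "read the full message" wrapper but bound how long the follow-up reads may wait. The name `read_with_deadline` is hypothetical, and the code assumes a platform where select() works on the handle in question.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Try to read $len bytes from $fh, waiting at most $timeout seconds overall.
# Returns ($data, $complete); $complete is false on timeout, EOF, or error,
# in which case $data holds whatever arrived before giving up.
sub read_with_deadline {
    my ($fh, $len, $timeout) = @_;
    my $deadline = time + $timeout;
    my $data = '';
    while (length($data) < $len) {
        my $left = $deadline - time;
        return ($data, 0) if $left <= 0;
        my $rin = '';
        vec($rin, fileno($fh), 1) = 1;
        my $nfound = select(my $rout = $rin, undef, undef, $left);
        next if $nfound < 0 && $!{EINTR};     # interrupted: recompute and retry
        return ($data, 0) if $nfound <= 0;    # timed out or select failed
        my $n = sysread($fh, $data, $len - length($data), length($data));
        next if !defined $n && $!{EINTR};
        return ($data, 0) unless $n;          # error or EOF
    }
    return ($data, 1);
}
```

The caller still has to decide what to do with an incomplete message, so this limits the hang rather than removing the underlying objection.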

    - tye        

      OK, technically you are right; that is a good answer to my question.
      Usually in my case messages are small and "rare",
      i.e. there is one master process, which receives messages from child processes, with a select() loop in the master. But when the master determines that there is a message from some child, it tries to read the full message, without trying to multiplex between two children and read messages from both at the same time.
      So, after select() reports that a filehandle is readable, I read the whole message from that child, and during that time I do not select() on other handles. The child process tries to write the whole message at once and does no heavy work in between. It's also a pipe, i.e. on localhost.
      Even if there are 100 children doing actual heavy work, each sending a 100-byte message to the master once a second, that is just 100 messages per second.
      Of course, that approach is not good when you're writing something like an HTTP proxy (like nginx) with slow clients, where multiplexing between clients directly affects performance; for such things there should be a more complicated solution.