in reply to Re^18: Net::OpenSSH loosing lines ins reply
in thread Net::OpenSSH loosing lines ins reply

write(1, "torm-control broadcast level 10."..., 4096) = 4096
write(1, "rol broadcast level 10.00\n1782: "..., 4096) = -1 EAGAIN (Resource temporarily unavailable)

The only plausible explanation I can think of is the OpenSSH ssh client putting STDOUT into non-blocking mode. BTW, how are you capturing STDOUT?

Add a dump of /proc/$$/fdinfo:

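open OUT, ">", "./trace.out" or die "cannot open trace.out: $!";
for (0, 1, 2) {
    local $/;    # slurp the whole fdinfo file at once
    open my $fdinfo, '<', "/proc/$$/fdinfo/$_" or die "cannot open fdinfo $_: $!";
    my $info = <$fdinfo>;
    print OUT "fdinfo $_:\n$info\n\n";
}
my $line = 0;
foreach (@cmdout) {
    $line++;
    # print returns true on success and false on failure
    my $bytes = print $line.": ".$_;
    print OUT "$line: bytes: $bytes, err: $! \n";
}
close(OUT);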

The non-blocking flag (O_NONBLOCK) is 04000 in octal, which is how fdinfo prints the flags (0x800 in hex on Linux/x86). It may also be worth dumping fdinfo before the constructor call, and before and after calling capture.
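
Reading the flag off the dump by eye is error-prone, so here is a minimal sketch of how the flags line could be checked from Perl. The helper names fdinfo_flags and fdinfo_is_nonblocking are made up for illustration, and the sketch assumes the usual Linux fdinfo layout with a "flags:" line in octal:

```perl
use strict;
use warnings;
use Fcntl qw(O_NONBLOCK);

# Hypothetical helper: return the numeric open flags found in an fdinfo
# dump, or undef if there is no "flags:" line. fdinfo prints the value in
# octal, hence oct().
sub fdinfo_flags {
    my ($info) = @_;
    return $info =~ /^flags:\s*([0-7]+)/m ? oct($1) : undef;
}

# Hypothetical helper: 1 if the O_NONBLOCK bit is set in the dump, else 0.
sub fdinfo_is_nonblocking {
    my ($info) = @_;
    my $flags = fdinfo_flags($info);
    return defined $flags && ($flags & O_NONBLOCK) ? 1 : 0;
}
```

Inside the dump loop it could be used as print OUT "fd $_ non-blocking: ", fdinfo_is_nonblocking($info), "\n"; which avoids hard-coding the numeric value of the flag.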

Re^20: Net::OpenSSH loosing lines ins reply
by Andy16 (Acolyte) on Jun 05, 2014 at 09:28 UTC
    Hi Salva,
    on a normal run:
    ============================== start
    fdinfo 0: pos: 0 flags: 0100002
    fdinfo 1: pos: 0 flags: 01
    fdinfo 2: pos: 0 flags: 01
    ============================== after new
    fdinfo 0: pos: 0 flags: 0100002
    fdinfo 1: pos: 0 flags: 01
    fdinfo 2: pos: 0 flags: 01
    ============================== after capture
    fdinfo 0: pos: 0 flags: 0100002
    fdinfo 1: pos: 0 flags: 04001
    fdinfo 2: pos: 0 flags: 04001


    for a failed run:
    ============================== start
    fdinfo 0: pos: 0 flags: 0100002
    fdinfo 1: pos: 0 flags: 01
    fdinfo 2: pos: 0 flags: 01
    ============================== after new
    fdinfo 0: pos: 0 flags: 0100002
    fdinfo 1: pos: 0 flags: 01
    fdinfo 2: pos: 0 flags: 01
    ============================== after capture
    fdinfo 0: pos: 0 flags: 0100002
    fdinfo 1: pos: 0 flags: 04001
    fdinfo 2: pos: 0 flags: 04001

    So the same...

    That doesn't mean anything to me... :-(

    I capture the output by piping it to "tee" or, even simpler, to "wc -l".
    In real use the script would be called by another script, which would read its stdout.
      As I had supposed, after the capture call the O_NONBLOCK flag (04000) is set on both STDOUT and STDERR: their flags go from 01 to 04001.

      The ssh command that is being run under the hood by capture is leaving STDOUT in non-blocking mode. The issue seems fixed in newer versions of OpenSSH, or at least I am unable to reproduce it with the latest one (6.6.1p1).

      In any case it is easy to work around. Using capture2 instead of capture, or setting stderr_discard => 1, should make it go away.
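
      A rough sketch of the capture2 variant, in case it helps. The wrapper name capture_lines, the host and the command are placeholders, and dying on any error is only one possible policy. My assumption about why this helps is that ssh then no longer inherits the script's own STDERR:

```perl
use strict;
use warnings;

# Hypothetical wrapper: run $cmd via capture2 so that the remote stderr is
# collected through its own pipe. $ssh is expected to be a connected
# Net::OpenSSH object.
sub capture_lines {
    my ($ssh, $cmd) = @_;
    my ($out, $err) = $ssh->capture2($cmd);
    die "remote command failed: " . $ssh->error . "\n" if $ssh->error;
    warn $err if defined $err && length $err;    # pass remote stderr on
    return defined $out ? split(/^/, $out) : ();
}

# Usage, with a placeholder host and command:
#   use Net::OpenSSH;
#   my $ssh = Net::OpenSSH->new('user@host');
#   $ssh->error and die "connection failed: " . $ssh->error;
#   my @cmdout = capture_lines($ssh, 'some command');
#
# The other variant keeps capture and throws the remote stderr away:
#   my @cmdout = $ssh->capture({ stderr_discard => 1 }, 'some command');
```

      The returned list keeps the trailing newline on every line, so the existing print loop over @cmdout should work unchanged.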

      BTW, there isn't any difference in the fdinfo dumps between the successful and failed invocations because the problem is hidden behind a race condition. The perl process needs to write faster than tee reads for the intermediate pipe buffer to fill; only then does a write on the non-blocking STDOUT fail with EAGAIN, and that output is lost.
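
      The effect can be reproduced without ssh. The following is only an illustrative sketch (fill_nonblocking_pipe is a made-up name): it writes into a non-blocking pipe that nobody reads, which is the extreme case of a slow reader:

```perl
use strict;
use warnings;
use IO::Handle;
use Errno qw(EAGAIN EWOULDBLOCK);

# Write 4096-byte chunks into a non-blocking pipe that is never read.
# Returns (bytes accepted before the first failed write, errno of that
# failure as a number).
sub fill_nonblocking_pipe {
    pipe(my $reader, my $writer) or die "pipe failed: $!";
    $writer->blocking(0);    # sets O_NONBLOCK on the write end
    my $chunk = 'x' x 4096;
    my $total = 0;
    while (1) {
        my $n = syswrite($writer, $chunk);
        # A full pipe makes syswrite return undef instead of waiting.
        return ($total, $! + 0) unless defined $n;
        $total += $n;
    }
}

my ($accepted, $errno) = fill_nonblocking_pipe();
print "accepted $accepted bytes, then errno $errno\n";
```

      With a blocking descriptor the same loop would simply wait for the reader. With O_NONBLOCK set the write fails once the buffer is full, which matches the EAGAIN in the strace output.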

        Hi Salva,

        GREAT


        I changed to capture2 and the problem has not reoccurred so far...

        Following your explanations, it should be solved now!

        hero of my day!

        no chance for me digging that out....