Non-blocking I/O woes

dwalin has asked for the wisdom of the Perl Monks concerning the following question:

Hello Wise Monks, I'm seeking your help again. I need to write a Perl script that would do several simple things:

Wait on STDIN for commands from the controller script, parse and execute these commands. There are only two of them at the moment: run, kill and quit. Run command accepts an arbitrary ID string and a script to pass on, ended with a dot - well, SMTP-like. Just like this:
```
run foo-bar-id
foo-foo-foo
bar-bar-bar
qux-qux-qux
.
[download]
```
The idea is that lines between #^run.*?$# and #\.# are passed on to the executed script. The kill <id> and quit commands are self-explanatory.
Upon receiving "run" command, my script should execute certain program in background, passing collected lines into its STDIN. The program then will stream out some text data in unbuffered fashion until killed (or something bad happens).
For all executed programs, my script should gather their output, prepend it with corresponding IDs and print to STDOUT. If you think that it looks suspiciously like multiplexing, you're right.
If an executed program dies in background, my script should silently restart it if it's not explicitly told not to by using "run nowatch <id>" command or somesuch.

Now that sounds easy, isn't it? I thought so, too, before I spent three days trying to understand the deep magic between pipes and signals. This is what I got so far:


#!/usr/bin/perl

use warnings;
use IO::Handle;
use IO::Select;
use File::Temp;
use Data::Dumper;

#use constant SCRIPT_PATH => '/path/to/obscure/program'; # It's always
+ there.

if (-x SCRIPT_PATH) {
  print "OK\n";
} else {
  print "FAIL No script found at " . SCRIPT_PATH . "\n";
  exit(1);
};

STDIN->blocking(0);
STDOUT->blocking(0);

my $io = IO::Handle->new_from_fd(fileno(STDIN), "r");
my $select = IO::Select->new($io);


my $buf = "";

$DEADER = 0;

sub Zombie_Alert {
  $DEADER = wait;
#  $SIG{CHLD} = \&Zombie_Alert;
};

$SIG{CHLD} = \&Zombie_Alert;

%CHILDREN = ();

KEEPEMCOMIN: while (my @ready = $select->can_read(undef)) {
  foreach my $handle (@ready) {
    my $line = ""; my $bufline = "";

    my $rv = $handle->sysread($bufline, 4096);
    $line .= $bufline;
    while ($rv == 4096) {

      $rv = $handle->sysread($bufline, 4096);
      $line .= $bufline unless !$rv;
    };

    if ($handle->fileno == fileno(STDIN)) { # process commands from ST
+DIN
      $buf .= $line;
      next KEEPEMCOMIN if $buf !~ /(^(run).*?^\.$)|(^(kill|quit).*?$)/
+sim;

      $buf =~ s#\r\n#\n#g;

YESSIR: while ($buf =~ /(^run.*?^\.$)|(^kill.*?$)|(^q(uit)?$)/gsim) {
        my $lbuf; $lbuf = $1 if $1; $lbuf = $2 if !$1 && $2;
        $lbuf = $3 if !$1 && !$2 && $3; my $savebuf = $lbuf;
        $lbuf =~ s#^\.\n##g;

        if ($lbuf =~ s/(^run.*?^)//sim) { $cmd = $1; chomp $cmd; };

        if ($cmd =~ /^run(\s+nowatch)?\s+(.*?)$/i) { # RUN command
          my $id = $2;
          my $watch = $1 && $1 =~ /nowatch/i ? 0 : 1;

          if (!$id) { print "FAIL No id specified\n"; next; };

         print spawn($id, $watch, $lbuf), "\n";
        } elsif ($cmd =~ /^kill\s+(.*?)$/i) {        # KILL command
          my $id = $1;

          if (my $pid = $CHILDREN{'id'.$id}) {
            my $fd = $CHILDREN{'id'.$id}{'io'}->fileno;

            kill SIGTERM, $pid;
            $select->remove($fd);
            undef $CHILDREN{'id'.$id}{'io'};
            delete($CHILDREN{'id'.$id});
            delete($CHILDREN{'pid'.$pid});
            delete($CHILDREN{'fd'.$fd});

            print "OK\n";
          } else {
            print "FAILED $2 No such id\n";
          };
        } elsif ($cmd =~ /^quit$/i) {   # QUIT command
          $SIG{CHLD} = 'IGNORE';
          foreach my $key (keys %CHILDREN) {
            next unless $key =~ /pid\d+/;
            kill SIGTERM, $CHILDREN{$key}{'pid'};

          };
          print "OK\n";
          exit(0);
        };
        $buf =~ s/$savebuf// unless !$savebuf;
        $buf =~ s/^\s*//g;
        next YESSIR if $buf;
      };
    } else {                          # process input from children
      local $id = $CHILDREN{'fd'.$handle->fileno}{'id'} . " ";
      foreach my $chunk (split /\n/, $line) { print "${id}${chunk}\n";
+ };
    };
  };
  if ($DEADER > 0) {   # ZOMBIE ALERT!
      if (exists($CHILDREN{'pid'.$DEADER})) {
        my $id = $CHILDREN{'pid'.$DEADER}{'id'};
        my $pid = $CHILDREN{'pid'.$DEADER}{'pid'};
        my $io = $CHILDREN{'pid'.$DEADER}{'io'};
        my $watch = $CHILDREN{'pid'.$DEADER}{'watch'};
        my $buf = $CHILDREN{'pid'.$DEADER}{'buf'};

        delete($CHILDREN{'id'.$id});
        delete($CHILDREN{'pid'.$pid});

        delete($CHILDREN{'fd'.$io->fileno});
        $select->remove($io->fileno);
        undef $io;

        if ($watch) {
          $ret = spawn($id, $watch, $buf);
          if ($ret =~ /FAIL/) { print "$ret\n"; }
        };
      } else { print "FAIL Unknown id died\n"; };
    $DEADER = 0;
    next KEEPEMCOMIN;
  };
};

exit(0);

$HANDLE = 1; # initial file handle number
$TMPNAME = undef; # temp file name

sub spawn {
  my ($id, $watch, $buf) = @_;

  unlink $TMPNAME unless !$TMPNAME;

  ($tmph, $TMPNAME) = File::Temp::tempfile();
  print $tmph $buf, "\n";
  close($tmph);

  $HANDLE++;

  if (my $pid = open('CHILD'.$HANDLE, "-|")) {    # parent process her
+e
    my $rd = IO::Handle->new_from_fd(fileno('CHILD'.$HANDLE), "r");
    $select->add($rd);

    my %kid;
    $kid{'id'} = $id;
    $kid{'pid'} = $pid;
    $kid{'io'} = $rd;
    $kid{'watch'} = $watch;
    $kid{'buf'} = $buf;

    $CHILDREN{'id'.$id} = \%kid;
    $CHILDREN{'pid'.$pid} = \%kid;
    $CHILDREN{'fd'.$rd->fileno} = \%kid;

    return "RUNNING $id";
  } elsif (defined($pid)) {  # child process here
    $SIG{INT} = $SIG{TERM} = $SIG{CHLD} = 'DEFAULT';
    open(STDIN, "<$TMPNAME");
    select STDOUT; $| = 1;
    exec(SCRIPT_PATH) or return "FAIL $id Cannot exec: $!";
  } else {
    return "FAIL $id Cannot fork: $!";
  };
};
[download]

And no, it ain't working as expected. There are several problems:

The $select->can_read() does not block, despite what documentation says. It keeps reporting that there's something in STDIN, even if there's nothing there for sure - thus entering a tight loop and hogging up CPU. Inserting a "sleep 1" right after it helps of course, but that's ugly kludge.
SIGCHLD is killing me. I cannot understand why it works differently depending on where my STDIN lies: if I run the script like "myscript < command_file" it appears to be working OK but if I run it in interactive mode and type (or paste) commands manually, it doesn't! It just silently goes to that exit(0) at the end of main script and exits! I think I tried all debugging methods Internet knows of and I still haven't got a clue WHY it does that. No error indication, no exceptions, no nothing - it just goes right out of the main loop right after signal handler! I'm at loss.
The script's behavior is erratic and bizarre, to say at least. One time I run it and it appears to work - another one it just sits there silently, and nothing happens. Black magic?

Frankly, I don't know what to think. A seemingly easy exercise in Perl Cookbook programming turned out to be a complete nightmare. And I don't even have a clue where it went so wrong... Hope somebody could shed a couple photons on this issue, I'd really appreciate it.

Oh, and there are limitations as well: this script will run under Solaris 9 using perl-5.6.1 provided with the system, and no non-standard modules are allowed. Ideally, this script shouldn't even be installed but run on-the-fly with perl -x after sshing on the box.

Thanks in advance for any input!

Regards,
Alex.

Comment on Non-blocking I/O woes Select or Download Code

Replies are listed 'Best First'.
Re: Non-blocking I/O woes by zentara (Cardinal) on Jan 10, 2011 at 18:38 UTC
The $select->can_read() does not block, despite what documentation says. It keeps reporting that there's something in STDIN Have you considered simplifying your code by using an event-loop system? For instance, can you use a simple program like this? #!/usr/bin/perl use warnings; use strict; use Glib; my $main_loop = Glib::MainLoop->new; Glib::IO->add_watch (fileno 'STDIN', [qw/in/], \&watch_callback, 'STDI +N'); #just to show it's non blocking my $timer1 = Glib::Timeout->add (1000, \&testcallback, undef, 1 ); $main_loop->run; sub watch_callback { # print "@_\n"; my ($fd, $condition, $fh) = @_; my $line = readline STDIN; print $line; #always return TRUE to continue the callback return 1; } sub testcallback{ print "\t\t\t".time."\n"; } __END__ [download] POE is another eventloop alternative with an nice POE Cookbook See the section on ProcessManagement I'm not really a human, but I play one on earth. Old Perl Programmer Haiku ................... flash japh	[reply] [d/l]
Re^2: Non-blocking I/O woes by dwalin (Monk) on Jan 10, 2011 at 19:14 UTC
Thanks for advice, I'll consider using it as last resort. I need to avoid any extra modules and/or software on that box since that'd break vendor maintenance conditions and you know how anal they're going to be about that. Even more, if I can make this script work it'll probably run on scores of customers' boxes, making maintenance headaches unbearable...	[reply]
Re: Non-blocking I/O woes by ikegami (Patriarch) on Jan 10, 2011 at 23:34 UTC
The $select->can_read() does not block, despite what documentation says. It keeps reporting that there's something in STDIN, even if there's nothing there for sure Define "nothing", `sysread` returning false? Then the handle is closed due to error (undef) or due to eof (zero). You need to handle those conditions. You are calling `select` (via `can_read`) on a handle that can't possibly ever return data. By the way, the following is buggy: `my $rv = $handle->sysread($bufline, 4096); $line .= $bufline; while ($rv == 4096) { $rv = $handle->sysread($bufline, 4096); $line .= $bufline unless !$rv; }` [download] It'll block if there's exactly 4096 bytes waiting. Remove the loop completely. You need to rely on your `select` (`can_read`) loop. What follows in the code should just restart the `select` loop unless it detects that it received a full message. This requires moving your handle's buffer (`$line`) outside of the loop, of course. In other words, the code should follow the following pattern: `- While there are handles from which to read, - Wait for data to arrive. - Read into the handle's buf. -> Don't forget to handle eof and error. - While the buf has a full command, - Remove the command from the buf. - Process the command.` [download] By the way, `my $bufline = ""; my $rv = $handle->sysread($bufline, 4096); $line .= $bufline;` [download] can be written as `my $rv = $handle->sysread($line, 4096, length($line));` [download]	[reply] [d/l] [select]
Re: Non-blocking I/O woes by juster (Friar) on Jan 11, 2011 at 04:59 UTC
I think `$select->can_read()` is blocking but not in the way you are expecting. `can_read` will return a handle when the end of file is reached as well. A feature I did not expect is that `can_read` will return an empty list, the blocking abruptly stopped, when a signal is caught by a signal handler. This means, after the SIGCHLD handler is executed the code execution returns from `can_read` with an empty list. This empty list means the end of your `while` loop and script. Inside your spawn sub, when you are forking by using `open` you are not exiting the forked child process. The child will merrily return from the spawn sub and act as a proud new parent. Two forked processes reading the same STDIN is a recipe for bizarre behavior. I think it would also be easier to use `open` to call the "script" program directly instead of forking and then calling the script with `exec`. If you use `open` to call the external program then you don't need to worry about exiting the forked process after `exec`. Here is a version I messed with tonight. IPC is always interesting... Read more... (5 kB) edit: wrapped code in a readmore tag	[reply] [d/l] [select]
Re^2: Non-blocking I/O woes by ikegami (Patriarch) on Jan 11, 2011 at 05:58 UTC
`my $line = <STDIN> \|\| q{};` You introduced bugs by moving away from `sysread`. First, you've introduced buffered IO. How can the `select` system call know that Perl has data waiting to be read in a buffer? Second, you're blocking until a newline is received, and that could be an indefinitely long wait.	[reply] [d/l] [select]
Re^3: Non-blocking I/O woes by juster (Friar) on Jan 11, 2011 at 15:41 UTC
The bug doesn't happen because perl has STDIN line-buffered. Select won't notice STDIN has read data until you press enter. This also means there will always be a newline to read for `<STDIN>`. `sysread` will give you the unbuffered activity you expect but `select` will not. I think this is where you are mistaken. This means select will still be blocking until you press enter. I was able to get select to trigger on a single character by using `tcsetattr` to disable line-buffering (aka canonical mode) for the terminal. (code at the end of the post) edit: Just to be clear about what this implies, this leads me to believe that line-buffering happens at a lower level than the perl interpreter. This answers your rhetorical question: How can the select system call know that Perl has data waiting to be read in a buffer? It is my hypothesis that the buffer you speak of happens at the system level and not inside the perl interpreter. Using stdio.h functions like `getc` and `getline`, you must read input through the terminal's internal buffer while in canonical mode. The `read` system call (via perl's `sysread`) does not use this higher-level terminal line buffer. I have no idea how this works in windows. #!/usr/bin/env perl use warnings; use strict; use IO::Select; use IO::Handle; use Inline 'C'; my $select = IO::Select->new( \STDIN ); disable_canon_mode() if @ARGV; # Experiment by commenting out the above line and giving command-line # arguments. while ( 1 ) { $select->can_read; # block until new input warn "CAN_READ STDIN\n"; my $in; if ( @ARGV ) { sysread STDIN, $in, 1; print "Char read: $in\n"; } else { $in = <STDIN>; print "Line read: $in\n"; } } __END__ __C__ #include <termios.h> #define STDIN 0 void disable_canon_mode () { struct termios tio; if ( isatty( STDIN ) == 0 ) { return; } / Turn off canonical mode, meaning read one char at a time. Set timer to 0, to wait forever. Minimum chars is 1. */ tcgetattr( STDIN, &tio ); tio.c_lflag &= ~ICANON; tio.c_cc[ VTIME ] = 0; tio.c_cc[ VMIN ] = 1; tcsetattr( STDIN, TCSANOW, &tio ); } [download]	[reply] [d/l] [select]
Re^4: Non-blocking I/O woes by ikegami (Patriarch) on Jan 11, 2011 at 19:04 UTC
Re^2: Non-blocking I/O woes by ikegami (Patriarch) on Jan 11, 2011 at 05:51 UTC
A feature I did not expect is that can_read will return an empty list, the blocking abruptly stopped, when a signal is caught by a signal handler. `select` returns on error, and that includes interruption by signal (EINTR).	[reply] [d/l]
Re^2: Non-blocking I/O woes by dwalin (Monk) on Jan 12, 2011 at 16:23 UTC
Thanks! That was it - somehow I never expected can_read() to return with an empty handle array upon receiving a signal. But it does and it really explains everything: the silent exit and the erratic behavior and all that. Now that I got this mystery solved I'm happy and productive again. :)	[reply]