dwalin has asked for the wisdom of the Perl Monks concerning the following question:

Hello Wise Monks, I'm seeking your help again. I need to write a Perl script that would do several simple things:

  1. Wait on STDIN for commands from the controller script, parse and execute these commands. There are only two of them at the moment: run, kill and quit. Run command accepts an arbitrary ID string and a script to pass on, ended with a dot - well, SMTP-like. Just like this:

    run foo-bar-id foo-foo-foo bar-bar-bar qux-qux-qux .

    The idea is that lines between #^run.*?$# and #\.# are passed on to the executed script. The kill <id> and quit commands are self-explanatory.

  2. Upon receiving "run" command, my script should execute certain program in background, passing collected lines into its STDIN. The program then will stream out some text data in unbuffered fashion until killed (or something bad happens).

  3. For all executed programs, my script should gather their output, prepend it with corresponding IDs and print to STDOUT. If you think that it looks suspiciously like multiplexing, you're right.

  4. If an executed program dies in background, my script should silently restart it if it's not explicitly told not to by using "run nowatch <id>" command or somesuch.

Now that sounds easy, isn't it? I thought so, too, before I spent three days trying to understand the deep magic between pipes and signals. This is what I got so far:

#!/usr/bin/perl use warnings; use IO::Handle; use IO::Select; use File::Temp; use Data::Dumper; #use constant SCRIPT_PATH => '/path/to/obscure/program'; # It's always + there. if (-x SCRIPT_PATH) { print "OK\n"; } else { print "FAIL No script found at " . SCRIPT_PATH . "\n"; exit(1); }; STDIN->blocking(0); STDOUT->blocking(0); my $io = IO::Handle->new_from_fd(fileno(STDIN), "r"); my $select = IO::Select->new($io); my $buf = ""; $DEADER = 0; sub Zombie_Alert { $DEADER = wait; # $SIG{CHLD} = \&Zombie_Alert; }; $SIG{CHLD} = \&Zombie_Alert; %CHILDREN = (); KEEPEMCOMIN: while (my @ready = $select->can_read(undef)) { foreach my $handle (@ready) { my $line = ""; my $bufline = ""; my $rv = $handle->sysread($bufline, 4096); $line .= $bufline; while ($rv == 4096) { $rv = $handle->sysread($bufline, 4096); $line .= $bufline unless !$rv; }; if ($handle->fileno == fileno(STDIN)) { # process commands from ST +DIN $buf .= $line; next KEEPEMCOMIN if $buf !~ /(^(run).*?^\.$)|(^(kill|quit).*?$)/ +sim; $buf =~ s#\r\n#\n#g; YESSIR: while ($buf =~ /(^run.*?^\.$)|(^kill.*?$)|(^q(uit)?$)/gsim) { my $lbuf; $lbuf = $1 if $1; $lbuf = $2 if !$1 && $2; $lbuf = $3 if !$1 && !$2 && $3; my $savebuf = $lbuf; $lbuf =~ s#^\.\n##g; if ($lbuf =~ s/(^run.*?^)//sim) { $cmd = $1; chomp $cmd; }; if ($cmd =~ /^run(\s+nowatch)?\s+(.*?)$/i) { # RUN command my $id = $2; my $watch = $1 && $1 =~ /nowatch/i ? 0 : 1; if (!$id) { print "FAIL No id specified\n"; next; }; print spawn($id, $watch, $lbuf), "\n"; } elsif ($cmd =~ /^kill\s+(.*?)$/i) { # KILL command my $id = $1; if (my $pid = $CHILDREN{'id'.$id}) { my $fd = $CHILDREN{'id'.$id}{'io'}->fileno; kill SIGTERM, $pid; $select->remove($fd); undef $CHILDREN{'id'.$id}{'io'}; delete($CHILDREN{'id'.$id}); delete($CHILDREN{'pid'.$pid}); delete($CHILDREN{'fd'.$fd}); print "OK\n"; } else { print "FAILED $2 No such id\n"; }; } elsif ($cmd =~ /^quit$/i) { # QUIT command $SIG{CHLD} = 'IGNORE'; foreach my $key (keys %CHILDREN) { next unless $key =~ /pid\d+/; kill SIGTERM, $CHILDREN{$key}{'pid'}; }; print "OK\n"; exit(0); }; $buf =~ s/$savebuf// unless !$savebuf; $buf =~ s/^\s*//g; next YESSIR if $buf; }; } else { # process input from children local $id = $CHILDREN{'fd'.$handle->fileno}{'id'} . " "; foreach my $chunk (split /\n/, $line) { print "${id}${chunk}\n"; + }; }; }; if ($DEADER > 0) { # ZOMBIE ALERT! if (exists($CHILDREN{'pid'.$DEADER})) { my $id = $CHILDREN{'pid'.$DEADER}{'id'}; my $pid = $CHILDREN{'pid'.$DEADER}{'pid'}; my $io = $CHILDREN{'pid'.$DEADER}{'io'}; my $watch = $CHILDREN{'pid'.$DEADER}{'watch'}; my $buf = $CHILDREN{'pid'.$DEADER}{'buf'}; delete($CHILDREN{'id'.$id}); delete($CHILDREN{'pid'.$pid}); delete($CHILDREN{'fd'.$io->fileno}); $select->remove($io->fileno); undef $io; if ($watch) { $ret = spawn($id, $watch, $buf); if ($ret =~ /FAIL/) { print "$ret\n"; } }; } else { print "FAIL Unknown id died\n"; }; $DEADER = 0; next KEEPEMCOMIN; }; }; exit(0); $HANDLE = 1; # initial file handle number $TMPNAME = undef; # temp file name sub spawn { my ($id, $watch, $buf) = @_; unlink $TMPNAME unless !$TMPNAME; ($tmph, $TMPNAME) = File::Temp::tempfile(); print $tmph $buf, "\n"; close($tmph); $HANDLE++; if (my $pid = open('CHILD'.$HANDLE, "-|")) { # parent process her +e my $rd = IO::Handle->new_from_fd(fileno('CHILD'.$HANDLE), "r"); $select->add($rd); my %kid; $kid{'id'} = $id; $kid{'pid'} = $pid; $kid{'io'} = $rd; $kid{'watch'} = $watch; $kid{'buf'} = $buf; $CHILDREN{'id'.$id} = \%kid; $CHILDREN{'pid'.$pid} = \%kid; $CHILDREN{'fd'.$rd->fileno} = \%kid; return "RUNNING $id"; } elsif (defined($pid)) { # child process here $SIG{INT} = $SIG{TERM} = $SIG{CHLD} = 'DEFAULT'; open(STDIN, "<$TMPNAME"); select STDOUT; $| = 1; exec(SCRIPT_PATH) or return "FAIL $id Cannot exec: $!"; } else { return "FAIL $id Cannot fork: $!"; }; };

And no, it ain't working as expected. There are several problems:

Frankly, I don't know what to think. A seemingly easy exercise in Perl Cookbook programming turned out to be a complete nightmare. And I don't even have a clue where it went so wrong... Hope somebody could shed a couple photons on this issue, I'd really appreciate it.

Oh, and there are limitations as well: this script will run under Solaris 9 using perl-5.6.1 provided with the system, and no non-standard modules are allowed. Ideally, this script shouldn't even be installed but run on-the-fly with perl -x after sshing on the box.

Thanks in advance for any input!

Regards,
Alex.

Replies are listed 'Best First'.
Re: Non-blocking I/O woes
by zentara (Cardinal) on Jan 10, 2011 at 18:38 UTC
    The $select->can_read() does not block, despite what documentation says. It keeps reporting that there's something in STDIN

    Have you considered simplifying your code by using an event-loop system? For instance, can you use a simple program like this?

    #!/usr/bin/perl use warnings; use strict; use Glib; my $main_loop = Glib::MainLoop->new; Glib::IO->add_watch (fileno 'STDIN', [qw/in/], \&watch_callback, 'STDI +N'); #just to show it's non blocking my $timer1 = Glib::Timeout->add (1000, \&testcallback, undef, 1 ); $main_loop->run; sub watch_callback { # print "@_\n"; my ($fd, $condition, $fh) = @_; my $line = readline STDIN; print $line; #always return TRUE to continue the callback return 1; } sub testcallback{ print "\t\t\t".time."\n"; } __END__
    POE is another eventloop alternative with an nice POE Cookbook See the section on ProcessManagement


    I'm not really a human, but I play one on earth.
    Old Perl Programmer Haiku ................... flash japh
      Thanks for advice, I'll consider using it as last resort. I need to avoid any extra modules and/or software on that box since that'd break vendor maintenance conditions and you know how anal they're going to be about that. Even more, if I can make this script work it'll probably run on scores of customers' boxes, making maintenance headaches unbearable...
Re: Non-blocking I/O woes
by ikegami (Patriarch) on Jan 10, 2011 at 23:34 UTC

    The $select->can_read() does not block, despite what documentation says. It keeps reporting that there's something in STDIN, even if there's nothing there for sure

    Define "nothing", sysread returning false? Then the handle is closed due to error (undef) or due to eof (zero). You need to handle those conditions. You are calling select (via can_read) on a handle that can't possibly ever return data.


    By the way, the following is buggy:

    my $rv = $handle->sysread($bufline, 4096); $line .= $bufline; while ($rv == 4096) { $rv = $handle->sysread($bufline, 4096); $line .= $bufline unless !$rv; }

    It'll block if there's exactly 4096 bytes waiting. Remove the loop completely. You need to rely on your select (can_read) loop. What follows in the code should just restart the select loop unless it detects that it received a full message. This requires moving your handle's buffer ($line) outside of the loop, of course. In other words, the code should follow the following pattern:

    - While there are handles from which to read, - Wait for data to arrive. - Read into the handle's buf. -> Don't forget to handle eof and error. - While the buf has a full command, - Remove the command from the buf. - Process the command.

    By the way,

    my $bufline = ""; my $rv = $handle->sysread($bufline, 4096); $line .= $bufline;

    can be written as

    my $rv = $handle->sysread($line, 4096, length($line));
Re: Non-blocking I/O woes
by juster (Friar) on Jan 11, 2011 at 04:59 UTC

    I think $select->can_read() is blocking but not in the way you are expecting. can_read will return a handle when the end of file is reached as well. A feature I did not expect is that can_read will return an empty list, the blocking abruptly stopped, when a signal is caught by a signal handler. This means, after the SIGCHLD handler is executed the code execution returns from can_read with an empty list. This empty list means the end of your while loop and script.

    Inside your spawn sub, when you are forking by using open you are not exiting the forked child process. The child will merrily return from the spawn sub and act as a proud new parent. Two forked processes reading the same STDIN is a recipe for bizarre behavior.

    I think it would also be easier to use open to call the "script" program directly instead of forking and then calling the script with exec. If you use open to call the external program then you don't need to worry about exiting the forked process after exec.

    Here is a version I messed with tonight. IPC is always interesting...

    edit: wrapped code in a readmore tag

      my $line = <STDIN> || q{};

      You introduced bugs by moving away from sysread.

      First, you've introduced buffered IO. How can the select system call know that Perl has data waiting to be read in a buffer?

      Second, you're blocking until a newline is received, and that could be an indefinitely long wait.

        The bug doesn't happen because perl has STDIN line-buffered. Select won't notice STDIN has read data until you press enter. This also means there will always be a newline to read for <STDIN>.

        sysread will give you the unbuffered activity you expect but select will not. I think this is where you are mistaken. This means select will still be blocking until you press enter. I was able to get select to trigger on a single character by using tcsetattr to disable line-buffering (aka canonical mode) for the terminal. (code at the end of the post)

        edit: Just to be clear about what this implies, this leads me to believe that line-buffering happens at a lower level than the perl interpreter. This answers your rhetorical question:

        How can the select system call know that Perl has data waiting to be read in a buffer?

        It is my hypothesis that the buffer you speak of happens at the system level and not inside the perl interpreter. Using stdio.h functions like getc and getline, you must read input through the terminal's internal buffer while in canonical mode. The read system call (via perl's sysread) does not use this higher-level terminal line buffer.

        I have no idea how this works in windows.

        #!/usr/bin/env perl use warnings; use strict; use IO::Select; use IO::Handle; use Inline 'C'; my $select = IO::Select->new( \*STDIN ); disable_canon_mode() if @ARGV; # Experiment by commenting out the above line and giving command-line # arguments. while ( 1 ) { $select->can_read; # block until new input warn "CAN_READ STDIN\n"; my $in; if ( @ARGV ) { sysread STDIN, $in, 1; print "Char read: $in\n"; } else { $in = <STDIN>; print "Line read: $in\n"; } } __END__ __C__ #include <termios.h> #define STDIN 0 void disable_canon_mode () { struct termios tio; if ( isatty( STDIN ) == 0 ) { return; } /* Turn off canonical mode, meaning read one char at a time. Set timer to 0, to wait forever. Minimum chars is 1. */ tcgetattr( STDIN, &tio ); tio.c_lflag &= ~ICANON; tio.c_cc[ VTIME ] = 0; tio.c_cc[ VMIN ] = 1; tcsetattr( STDIN, TCSANOW, &tio ); }

      A feature I did not expect is that can_read will return an empty list, the blocking abruptly stopped, when a signal is caught by a signal handler.

      select returns on error, and that includes interruption by signal (EINTR).

      Thanks! That was it - somehow I never expected can_read() to return with an empty handle array upon receiving a signal. But it does and it really explains everything: the silent exit and the erratic behavior and all that. Now that I got this mystery solved I'm happy and productive again. :)