talexb has asked for the wisdom of the Perl Monks concerning the following question:

I am running an external program and examining the output as part of an event handler. The qstat utility is part of the Sun Grid Engine and tells me what jobs are in the queue.

open(QSTAT, 'qstat|') or $self->_die("failed to start qstat ($!)");
while (<QSTAT>) {
    # Read the results, act on them
}
close QSTAT
    or ($! ? $self->_die("close qstat ($!)")
           : $logger->error("close qstat ($?)") );

I compare that list of queued and running jobs against a list of jobs I've previously submitted to the grid engine, and note which jobs appear to have finished (because they're not listed by qstat anymore), and deal with them.

A bug that's popped up recently is that I'm getting a $? error code of 6 back from close. That's either SIGABRT or SIGIOT, as per /usr/include/asm/signal.h .. I'm just trying to understand what that means, and to find out if Perl is doing anything behind the scenes that I should know about.
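For reference, here is a minimal sketch of how the value in $? can be taken apart, following the layout described in perlvar: the low 7 bits hold the signal number, bit 8 flags a core dump, and the high byte holds the exit code. The helper name decode_wait_status is made up for illustration and is not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: split a raw wait status (the value Perl leaves
# in $? after close on a pipe) into its three parts.
sub decode_wait_status {
    my ($status) = @_;
    return {
        signal    => $status & 127,            # terminating signal, 0 if none
        core_dump => ($status & 128) ? 1 : 0,  # set if the child dumped core
        exit_code => $status >> 8,             # exit value of the child
    };
}

my $info = decode_wait_status(6);
printf "signal=%d core=%d exit=%d\n",
    @{$info}{qw(signal core_dump exit_code)};
# prints "signal=6 core=0 exit=0"
```

Read this way, a $? of 6 means the child was killed by signal 6 rather than exiting with code 6.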

On top of the error, I'm getting no data back, so it seems like Linux is not running my command for some reason. Because nothing comes back, my code assumes (ha) that all of the grid engine jobs have finished .. which is a problem.
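One way to avoid that wrong conclusion is to treat a failed open or close as "queue state unknown" instead of "queue empty". The following is only a sketch under assumptions: read_command_lines and finished_jobs are hypothetical names, the job ID is assumed to be the first whitespace-separated field of each line (real qstat output also has header lines that would need skipping), and a child perl stands in for qstat so the example runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: run a command and return a reference to its
# output lines, or undef if it could not be started or did not exit
# cleanly (close on a pipe returns false when $? is non-zero).
sub read_command_lines {
    my (@cmd) = @_;
    open(my $fh, '-|', @cmd) or return;
    my @lines = <$fh>;
    chomp @lines;
    close($fh) or return;
    return \@lines;
}

# Hypothetical helper: report which submitted jobs no longer appear in
# the queue listing. Returns undef when the listing itself is unknown.
sub finished_jobs {
    my ($submitted, $queued_lines) = @_;
    return unless defined $queued_lines;
    # Assumption: the job ID is the first field on each line.
    my %queued = map {
        my ($id) = split ' ', $_;
        defined $id ? ($id => 1) : ();
    } @$queued_lines;
    return [ grep { !$queued{$_} } @$submitted ];
}

my @submitted = (101, 102, 103);

# Stand-in for qstat: a child perl that lists jobs 101 and 103.
my $lines = read_command_lines($^X, '-e', 'print "101\n103\n"');
my $done  = finished_jobs(\@submitted, $lines);
print defined $done
    ? "finished: @$done\n"
    : "queue state unknown, skipping this pass\n";
```

With this shape, a qstat that dies with a signal makes the whole pass a no-op, and the jobs are checked again on the next event.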

I've googled, I've searched this site, but haven't found anything that helps me understand what this error means.

Alex / talexb / Toronto

"Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Replies are listed 'Best First'.
Re: Closing a piped command gives SIGABRT/SIGIOT
by dave_the_m (Monsignor) on Feb 18, 2005 at 02:42 UTC
    $? is telling you that the child process died with a signal, so it's unlikely to be anything to do with perl itself.
    Try something like
    open(QSTAT, 'strace -o /tmp/tr qstat|')
    to see what's going on in the child process. (A side-effect of this is that perl will now see the exit status of strace rather than qstat, so close() won't report anything wrong.)

    Dave.

      Ah, strace. Good catch. I'll try that out. Thanks for the suggestion.

      Alex / talexb / Toronto

      Dave, thanks again for your suggestion. I have now tracked this extensively over the past two days and we finally found what we think is the problem .. a bad NIC. This was causing timeouts, dropped RPC commands and the appearance that the kernel was running out of buffers.

      Alex / talexb / Toronto