Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
While investigating a perl process that seemed to be hung, I found that it was waiting in read(2) on a pipe opened to a dead child process spawned using backticks. The child process had died a long time ago and was in the zombie state, because the parent was stuck in read(2) and hadn't called wait(2) yet. It's easy enough to reproduce. I ran some tests and this is what I found:

- it happens only when spawning child processes using backticks; spawning using system seems to work fine.
- it happens when the child is a bash/sh script
The parent perl script - parent.pl
---------------------------------------

    #!/usr/bin/perl
    print "I AM PARENT\n";
    my $x=`/root/a.sh`;
    #my $x=`/root/b.pl`;
    #my $x=`/root/a.out`;
    #my $x=system("/root/a.sh");
    print "PARENT EXITING\n";

Child script - a.sh
---------------------

    #!/bin/sh
    echo "I am a.sh......."
    sleep 6000
    echo "I am gonna die ........"
    exit 123

Another child script but which is perl instead of bash - b.pl
---------------------------------------------------------------------

    #!/usr/bin/perl
    print "I am b.pl\n";
    sleep(6000000);
    print "I am gonna die...\n";
Now execute parent.pl. It creates a pipe to the STDOUT of the child process and waits in read(2). Then kill the child process: the parent still waits in read(2). One would expect the death of the child process to close the write end of the pipe, which would cause read(2) to return 0 and let the parent terminate too. Instead, read(2) returns ERESTARTSYS and resumes waiting.
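For readers who want to see the mechanism spelled out, here is a rough Perl sketch of the sequence described above: create a pipe, fork, point the child's STDOUT at the write end, then read until end-of-file and only afterwards wait for the child. It is an approximation based on the strace output, not perl's actual implementation (the strace also shows a second pipe, which this sketch leaves out). The helper name `capture` is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: approximates what backticks appear to do in the strace.
sub capture {
    my @cmd = @_;

    pipe(my $reader, my $writer) or die "pipe: $!";

    my $pid = fork();
    die "fork: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: send STDOUT into the pipe, then run the command.
        close $reader;
        open(STDOUT, '>&', $writer) or die "dup: $!";
        close $writer;
        exec(@cmd) or die "exec: $!";
    }

    # Parent: keep only the read end of the pipe.
    close $writer;

    my $out = '';
    while (1) {
        my $n = sysread($reader, my $buf, 4096);
        die "read: $!" unless defined $n;
        last if $n == 0;    # read(2) returned 0: end-of-file on the pipe
        $out .= $buf;
    }
    close $reader;

    # The child is reaped only after the read loop has finished.
    waitpid($pid, 0);
    return ($out, $?);
}

my ($out, $status) = capture('echo', 'hello');
print "output: $out";
print "exit status: ", $status >> 8, "\n";
```

Note that in this sketch the parent does not call waitpid until the read loop has seen end-of-file, which matches the order of read(2) and wait4(2) in the traces.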
    [root@onong ~]# ps aux | grep a.sh
    root 23171 0.0 0.0 63860 1084 pts/4 S+ 03:12 0:00 /bin/sh /root/a.sh
    root 23514 0.0 0.0 61176 804 pts/2 S+ 03:12 0:00 grep a.sh
    [root@onong ~]# kill 23171
    [root@onong ~]# ps aux | grep a.sh
    root 23171 0.0 0.0 0 0 pts/4 Z+ 03:12 0:00 [a.sh] <defunct>    <----------------------- CHILD BECOMES ZOMBIE
    root 23967 0.0 0.0 61176 804 pts/2 S+ 03:13 0:00 grep a.sh
    [root@onong ~]#

Strace output of the parent process:

    [root@onong ~]# strace ./test.pl
    .
    .
    .
    open("./test.pl", O_RDONLY) = 3
    ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff24ed2260) = -1 ENOTTY (Inappropriate ioctl for device)
    lseek(3, 0, SEEK_CUR) = 0
    fcntl(3, F_SETFD, FD_CLOEXEC) = 0
    fstat(3, {st_mode=S_IFREG|0755, st_size=160, ...}) = 0
    rt_sigaction(SIGCHLD, NULL, {SIG_DFL, [], 0}, 8) = 0
    readlink("/proc/self/exe", "/usr/bin/perl"..., 4095) = 13
    brk(0x170a1000) = 0x170a1000
    read(3, "#!/usr/bin/perl\n\nprint \"I AM PAR"..., 4096) = 160
    read(3, "", 4096) = 0
    close(3) = 0
    write(1, "I AM PARENT\n", 12I AM PARENT
    ) = 12
    pipe([3, 4]) = 0
    pipe([5, 6]) = 0
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2b61c66692e0) = 23171    <--------- CHILD SPAWNED
    close(6) = 0
    close(4) = 0
    read(5, "", 4) = 0
    close(5) = 0
    ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff24ed21a0) = -1 EINVAL (Invalid argument)
    lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
    fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
    read(3, "I am a.sh.......\n", 4096) = 17
    read(3, 0x17082210, 4096) = ? ERESTARTSYS (To be restarted)    <-------------- DOESN'T RETURN 0
    --- SIGCHLD (Child exited) @ 0 (0) ---
    read(3,    <----------------- KEEPS WAITING IN READ(2)
The strange thing is that the same doesn't happen if the child process is not a bash/sh script. In parent.pl, comment out the line that spawns a.sh and uncomment the line that spawns b.pl, which is a perl script, then run the test again.
    [root@onong ~]# ps aux | grep b.pl
    root 29350 0.0 0.0 77884 1488 pts/4 S+ 03:15 0:00 /usr/bin/perl /root/b.pl
    root 29495 0.0 0.0 61176 756 pts/2 S+ 03:15 0:00 grep b.pl
    [root@onong ~]# kill 29350
    [root@onong ~]# ps aux | grep b.pl
    root 30028 0.0 0.0 61176 748 pts/2 S+ 03:16 0:00 grep b.pl
    [root@onong ~]#

Strace of parent process:

    read(3, "#!/usr/bin/perl\n\nprint \"I AM PAR"..., 4096) = 160
    read(3, "", 4096) = 0
    close(3) = 0
    write(1, "I AM PARENT\n", 12I AM PARENT
    ) = 12
    pipe([3, 4]) = 0
    pipe([5, 6]) = 0
    clone(child_stack=0, flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD, child_tidptr=0x2ba7aabac2e0) = 29350
    close(6) = 0
    close(4) = 0
    read(5, "", 4) = 0
    close(5) = 0
    ioctl(3, SNDCTL_TMR_TIMEBASE or TCGETS, 0x7fff4156e300) = -1 EINVAL (Invalid argument)
    lseek(3, 0, SEEK_CUR) = -1 ESPIPE (Illegal seek)
    fstat(3, {st_mode=S_IFIFO|0600, st_size=0, ...}) = 0
    read(3, "", 4096) = 0
    --- SIGCHLD (Child exited) @ 0 (0) ---
    close(3) = 0
    rt_sigaction(SIGHUP, {0x1, [], SA_RESTORER, 0x343740e7c0}, {SIG_DFL, [], 0}, 8) = 0
    rt_sigaction(SIGINT, {0x1, [], SA_RESTORER, 0x343740e7c0}, {SIG_DFL, [], 0}, 8) = 0
    rt_sigaction(SIGQUIT, {0x1, [], SA_RESTORER, 0x343740e7c0}, {SIG_DFL, [], 0}, 8) = 0
    wait4(29350, [{WIFSIGNALED(s) && WTERMSIG(s) == SIGTERM}], 0, NULL) = 29350
    rt_sigaction(SIGHUP, {SIG_DFL, [], SA_RESTORER, 0x343740e7c0}, NULL, 8) = 0
    rt_sigaction(SIGINT, {SIG_DFL, [], SA_RESTORER, 0x343740e7c0}, NULL, 8) = 0
    rt_sigaction(SIGQUIT, {SIG_DFL, [], SA_RESTORER, 0x343740e7c0}, NULL, 8) = 0
    write(1, "PARENT EXITING\n", 15PARENT EXITING
    ) = 15
    exit_group(0) = ?
I also ran the test with the child process being a C program, and it works fine. So it would seem that the perl interpreter is doing some special processing for sh/bash scripts? Thanks.