santosh_sugur has asked for the wisdom of the Perl Monks concerning the following question:

Hello PerlMonks,
Please help me with the following problem..

My goal is to fill a file called testfile with data by cat'ing /dev/urandom, so I have written the following code for it. I just want to fill the file for about 1 second and then kill the process.

#!/usr/bin/perl
use warnings;
use strict;

if (-f "./testfile") {
    !system "rm -f testfile" or die "Couldn't delete already present testfile\n";
}

my $pid;
my $parent = $$;
open (INFILE, "+>testfile") or die "Can't create testfile: $!";
print "This is parent $parent \n";
defined($pid = fork) or die "Cannot fork: $!";
unless($pid) {
    # Child process is here
    print "This is child pid $$\n";
    exec "cat /dev/urandom > testfile";
}
sleep 1;
close INFILE;
kill -9 => $pid;

When exec runs here, two processes are actually created. This baffled me at first, but I realized there are two things going on in the exec call: one is the cat /dev/urandom itself, and the other, I believe, is the shell that gets invoked to carry out the redirection. The output of the program is as follows:

# ./test.pl
This is parent 29307
This is child pid 29309
#

Even after the program finishes (that is, I get the prompt back), testfile is still being populated from the output of /dev/urandom. This should not happen. A ps -ef shows the following:

# ps -ef|grep urandom
root     29309     1  0 00:53 pts/2    00:00:00 sh -c cat /dev/urandom > testfile
root     29310 29309 99 00:53 pts/2    00:00:12 cat /dev/urandom

My questions:
1. When I kill process 29310, testfile stops growing, but how do I make that happen from within the code?
2. Why does process 29309 still show when I have killed it in my code?

Thanks in Advance,
Santosh

Replies are listed 'Best First'.
Re: exec creates two processes.. how to kill them?
by jettero (Monsignor) on Jan 10, 2008 at 16:26 UTC

    Rather than telling you how to find the pids and kill them, I'm choosing to describe a better way to "cat" the device, that requires no extra pids.

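    use strict;
    use warnings;

    open my $in,  '<', '/dev/urandom' or die "couldn't open random: $!";
    open my $out, '>', 'a_file.bin'   or die "couldn't open output file: $!";

    eval {
        # Without a handler, SIGALRM would terminate the script instead of
        # just ending the loop, so turn it into an exception that eval catches.
        local $SIG{ALRM} = sub { die "alarm\n" };
        alarm 1;
        while (1) {
            read($in, my $buf, 1024) or die "error reading: $!";
            print {$out} $buf;
        }
    };
    alarm 0;                             # cancel any pending alarm
    die $@ if $@ && $@ ne "alarm\n";     # re-raise real errors

    close $in;
    close $out;                          # flushes whatever is still buffered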

    That was from memory, so if I got something wrong, don't sue me or anything. I also think you can make it a lot better than what I put above, but I also think all the right stuff is in there so you can cook from it with little trouble.

    -Paul

      Thank you Paul. That worked just as you had typed. I also learnt a lot of things from that small bit of code. This is my first experience here, and it's really been enlightening :)
Re: exec creates two processes.. how to kill them?
by Eimi Metamorphoumai (Deacon) on Jan 10, 2008 at 17:18 UTC
    You've already got an answer for a better way to do it (and I'll mention in passing that you should probably use unlink instead of shelling out for rm), but I'll answer the question you actually asked.

    What's happening is that, because of the redirection, perl is spawning a shell, and passing the command line you specified to it. That's your 29309 process. That shell, then, sets up the stdin and stdout filehandles, and itself forks another process (the cat process, 29310). As far as how to fix it, there are almost certainly ways to parse the output of ps or do all sorts of crazy things, but the simplest answer is to just do the copying internally in perl. In fact, in that case instead of copying "for about a second", you can decide on exactly how much data you want to copy, and just use that.
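A minimal sketch of the "copy internally in perl, with a fixed amount of data" idea described above. It assumes a target size of 1 MiB is wanted; the helper name copy_bytes and the 64 KiB chunk size are illustrative choices, not anything from the original posts. It also uses unlink in place of the rm shell-out, as suggested.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# copy_bytes is a hypothetical helper: copy up to $want bytes from $src
# to $dst and return the number of bytes actually written.
sub copy_bytes {
    my ($src, $dst, $want) = @_;
    open my $in,  '<', $src or die "Can't open $src: $!";
    open my $out, '>', $dst or die "Can't create $dst: $!";
    binmode $in;
    binmode $out;

    my $done = 0;
    while ($done < $want) {
        my $chunk = $want - $done;
        $chunk = 65536 if $chunk > 65536;      # read at most 64 KiB at a time
        my $got = read($in, my $buf, $chunk);
        die "Error reading $src: $!" unless defined $got;
        last if $got == 0;                     # end of input
        print {$out} $buf or die "Error writing $dst: $!";
        $done += $got;
    }
    close $in;
    close $out or die "Error closing $dst: $!";
    return $done;
}

unlink 'testfile';    # built-in replacement for: system "rm -f testfile"
my $written = copy_bytes('/dev/urandom', 'testfile', 1024 * 1024);
print "Wrote $written bytes\n";
```

Because nothing is forked, there is no child or shell left to kill, and the file size is exact rather than dependent on how much cat manages to write in a second.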

      To make it complete, here's working code that reopens STDOUT in the child process, which avoids the redirection in the command altogether. The exec then does not spawn a shell.
      #!/usr/bin/perl
      use warnings;
      use strict;

      if (-f "./testfile") {
          !system "rm -f testfile" or die "Couldn't delete already present testfile\n";
      }

      my $pid;
      my $parent = $$;
      print "This is parent $parent \n";
      defined($pid = fork) or die "Cannot fork: $!";
      unless($pid) {
          print "This is child pid $$\n";
          open (STDOUT, '>', 'testfile') or die "Can't create testfile: $!";
          select STDOUT; $| = 1;
          my @cmd = qw(cat /dev/urandom);
          # Child process is here
          exec { $cmd[0] } @cmd;
      }
      sleep 1;
      kill 15 => $pid;
      wait;

      The slightly less crazy way to do it if you're bound and determined to use the shell to do redirection (which is indeed silly in this particular case) is to use the shell's exec command to tell it to exec the command you're asking it to run in place of itself (well, that perl is indirectly asking it to run on your behalf because you used the single argument form of exec with a string containing shell metacharacters).
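A rough sketch of that suggestion, assuming /bin/sh is a POSIX shell: the command string is prefixed with the shell built-in exec, so the shell sets up the redirection and then replaces itself with cat. The PID returned by fork should then belong to cat, and a plain (positive) signal is enough. The variable names are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

defined(my $pid = fork) or die "Cannot fork: $!";
unless ($pid) {
    # Perl still starts "sh -c ..." because of the ">" metacharacter,
    # but the leading "exec" (a shell built-in) makes the shell replace
    # itself with cat instead of forking a second process.
    exec 'exec cat /dev/urandom > testfile' or die "Cannot exec: $!";
}

sleep 1;
kill TERM => $pid;      # positive signal: just this one process
waitpid $pid, 0;
my $status = $?;
print "child $pid ended with signal ", $status & 127,
      ", testfile is ", -s 'testfile', " bytes\n";
```

While this is running, ps should show a single cat /dev/urandom process whose parent is the Perl script, rather than the sh/cat pair from the original question.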

      The cake is a lie.

Re: exec creates two processes.. how to kill them?
by almut (Canon) on Jan 10, 2008 at 18:59 UTC

    You've already been given better solutions for your task, but it's still interesting to understand why your approach didn't work... so

    2. Why does process 29309 still show when I have killed it in my code?

    You're using a negative signal (intentionally or not), which means you're sending the signal to a process group (29309 in your case). This is generally a reasonable approach here (because it would have the cat process be signaled as well, which would otherwise remain running, if you kill the shell only). It didn't work, however, as a process group with that ID didn't exist.

    When doing a ps (with customized output options (Linux syntax)) while all processes were still running, you'd have gotten something like:

    $ ps axf -o pid,ppid,pgrp,cmd
      PID  PPID  PGRP CMD
    ...
    29307  4061 29307  \_ /usr/bin/perl ./661639.pl
    29309 29307 29307      \_ sh -c cat /dev/urandom > testfile
    29310 29309 29307          \_ cat /dev/urandom

    which shows that the process group is not 29309, but rather 29307, because the shell in this case doesn't create a new process group.

    In other words, one of the following would have worked (note that you don't need signal 9 here, 15 (SIGTERM) is absolutely sufficient):

    kill -15 => getpgrp($pid);  # process group of child
    kill -15 => getpgrp;        # process group of current process (the script)
    # or even (as $$ equals the process group in this case):
    kill -15 => $$;

    ( A problem could be (depending on context) that the script does get killed, too... but I won't try to come up with a workaround for that, because it's moot by now, anyway. )
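For completeness, one possible way around the script killing itself, offered as a sketch rather than as part of the answer above: have the child call setpgrp(0, 0) before exec, so that the shell and cat end up in a new process group whose ID equals the child's PID. The negative signal sent to $pid then reaches that group only. This assumes a POSIX system; $signalled and $status are illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

defined(my $pid = fork) or die "Cannot fork: $!";
unless ($pid) {
    # Child: become leader of a new process group (group ID == our PID).
    # The shell and the cat it forks inherit this group.
    setpgrp(0, 0) or die "Cannot setpgrp: $!";
    exec 'cat /dev/urandom > testfile' or die "Cannot exec: $!";
}

sleep 1;    # also gives the child ample time to call setpgrp

# Negative signal number: signal every process in group $pid,
# which no longer includes this script.
my $signalled = kill -15 => $pid;
waitpid $pid, 0;
my $status = $?;
print "signalled $signalled group(s), child ended with signal ",
      $status & 127, "\n";
```

The script survives because it stays in its original process group; only the child's group receives SIGTERM.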

      Thank you Eimi, chem, Fletch, and almut for painstakingly explaining what I was getting into...
      It's been a good day at the monastery.