jyoshv has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

My objective is to write a subroutine to gzip hundreds of files and move them to a different directory on a server. This is to enhance our log-rotation script. Sequentially gzipping hundreds of files takes a lot of time, and I want to speed it up by spawning 10 processes to gzip the files. As a child process dies, I want to create another process so that at any point in time there are 10 or fewer processes running on the server. This is to make sure the server does not run out of resources.

Since this is the first time I'm working with fork() in Perl, I thought I'd start with a simpler program. Here is a Perl program which I found on the net and modified just a little. All it does is spawn 3 processes to cat 3 files; as each child dies, it prints a message saying the child died, and it exits after all 3 children are dead.

However, I always find one or two of those children hanging. Could someone please take a look at the code and tell me what is going wrong?

Thanks, Jyothi

PS : BELOW IS THE SCRIPT.

#!/usr/bin/perl -w
use strict;
use POSIX qw(:signal_h :errno_h :sys_wait_h);

sub sched {
    my $file = shift || die "Missing parameter for subroutine call\n";
    my $time = shift || die "Missing second parameter on subroutine call\n";
    my $chd  = shift;
    sleep 5;
    my $cmd = `cat $file`;
    print " Child $chd in sched \n";
    print $cmd;
}

# main program
my $pid;          # process id
my $counter = 0;
my %children;     # Hashtable to hold all pids of our children
my @childs;       # will hold all pids of our children

$SIG{CHLD} = sub {
    my $pid;
    my $j = -1;
    $pid = waitpid(-1, &WNOHANG);
    if (WIFEXITED($?)) {
        # Remove the child that just died
        if ($children{$pid}) {
            $j = $children{$pid};
            delete $children{$pid};
            delete $childs[$j];
            print "Process $pid exited $j and $#childs\n";
        }
    }
};

# allow to end program by pressing Ctrl-C or a kill call
$SIG{TERM} = sub { kill -9, @childs; exit(0) };

# this will hold all of our tasks
my @tasks = (
    { file => "msg.1", time => 10, },
    { file => "msg.2", time => 2,  },
    { file => "msg.3", time => 5,  },
);

# looping through the array of hashes and forking with each element
my $j = 0;
FORLOOP: for my $i (0 .. $#tasks) {
    FORK: {
        if ($pid = fork) {              # first fork
            # we are parent
            push @childs, $pid;
            $children{$pid} = $#childs;
            print "Parent forked child # ===> $#childs, pid = $pid \n";
            next FORLOOP;
        } elsif (defined $pid) {
            # so we are a child
            print "Executing child $j \n";
            sched($tasks[$i]{file}, $tasks[$i]{time}, $j);
            print "Done! pid = $pid\n";
            exit(0);
        } elsif ($! =~ /No more process/) {
            sleep 5;
            redo FORK;
        } else {
            die "Something is wrong: $!\n";
        }
    }
}

print "started all children $#childs \n";
while ($#childs > -1) {;}   # Exit after all children are dead.

Re: need help with Hanging Child Processes
by Limbic~Region (Chancellor) on Jan 11, 2005 at 17:28 UTC
    jyoshv,
    This seems like an ideal candidate for Parallel::ForkManager.

    The following code was added after the initial post. It is intended as an example and not specifically addressing the problem at hand.
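    A minimal sketch of the kind of example meant here, using Parallel::ForkManager to keep at most 10 gzip jobs running at once (the directory names and glob pattern are placeholders, not taken from the original post):

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Parallel::ForkManager;
        use File::Copy qw(move);

        my $src  = '/var/log/app';        # placeholder source directory
        my $dest = '/var/log/archive';    # placeholder destination directory

        # never more than 10 children at once; as one finishes, the next starts
        my $pm = Parallel::ForkManager->new(10);

        for my $file (glob "$src/*.log") {
            $pm->start and next;          # parent: child forked, move to next file
            system('gzip', $file) == 0 or die "gzip $file failed: $?";
            move("$file.gz", $dest)     or die "move $file.gz: $!";
            $pm->finish;                  # child exits here
        }
        $pm->wait_all_children;           # block until the pool drains

    start() returns the new child's pid to the parent (true) and 0 to the child, so "$pm->start and next" keeps the parent looping while each child runs the gzip-and-move body.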

    Cheers - L~R

      Thanks. I am looking into it.
Re: need help with Hanging Child Processes
by BUU (Prior) on Jan 11, 2005 at 18:53 UTC
    Also not addressing your specific problem, but what makes you think multiple processes will speed up your task? Is the box multi-processor? Are all the files on different hard drives?

    As far as I know, gzip is typically CPU bound, which means its speed depends almost entirely on how fast your CPU can compress the data. On a single processor, adding another process just makes both jobs take roughly twice as long. The same applies if you are disk bound: adding more processes won't speed things up.
      BUU, yep, the box is multi-processor and the files are on 2 hard drives.
Re: need help with Hanging Child Processes
by bluto (Curate) on Jan 11, 2005 at 19:03 UTC
    Are the hung children printing anything before they hang? If not, are they blocked trying to write their output (check with 'ps')? Normally when a child is forked, you'll want to redirect its output somewhere else (e.g. a file, or a pipe back to the parent process) to avoid having the system block it when it tries to write.

    One quick and dirty way to check if it's blocking on output is to temporarily comment out the print statements in the children. If things start working, then this is probably the problem.
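    As a rough sketch of that redirection idea (the /tmp file name is just illustrative), the child can reopen STDOUT and STDERR onto its own file right after the fork, before it prints anything:

        #!/usr/bin/perl
        use strict;
        use warnings;

        my $pid = fork();
        die "fork failed: $!" unless defined $pid;

        if ($pid == 0) {
            # child: write to its own file instead of the shared terminal
            open STDOUT, '>', "/tmp/child.$$.out" or die "redirect STDOUT: $!";
            open STDERR, '>&', \*STDOUT           or die "redirect STDERR: $!";
            print "child $$ doing its work\n";    # lands in /tmp/child.<pid>.out
            exit 0;
        }

        waitpid($pid, 0);                         # parent reaps the child
        print "child $pid finished; its output is in /tmp/child.$pid.out\n";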

      Thanks for your reply. Actually, they are done printing everything they have to; "Done! pid = 0" is the last statement before exit(0) in the child. I'm not sure what is stopping them from exiting:

        Parent forked child # ===> 0, pid = 6382
        Parent forked child # ===> 1, pid = 6383
        Parent forked child # ===> 2, pid = 6384
        started all children 2
        Argument "CHLD" isn't numeric in subroutine entry at /usr/lib/perl5/5.6.1/i386-linux/POSIX.pm line 38.
        Child 0 in sched
        This is Msg1
        Done! pid = 0
        Argument "CHLD" isn't numeric in subroutine entry at /usr/lib/perl5/5.6.1/i386-linux/POSIX.pm line 38.
        Argument "CHLD" isn't numeric in subroutine entry at /usr/lib/perl5/5.6.1/i386-linux/POSIX.pm line 38.
        Child 0 in sched
        This is Msg3
        Done! pid = 0
        Child 0 in sched
        This is Msg2
        Done! pid = 0
        Argument "CHLD" isn't numeric in subroutine entry at /usr/lib/perl5/5.6.1/i386-linux/POSIX.pm line 38.
        Process 6384 exited

      When I do ps -ef, I can see the parent and 2 of its children hanging. Not sure what to make of that.

        /opt/jyo/ > ps -ef | grep perl
        root      6381  4559 98 15:15 pts/0    00:11:56 perl jfork.pl
        root      6382  6381  0 15:15 pts/0    00:00:00 perl <defunct>
        root      6383  6381  0 15:15 pts/0    00:00:00 perl <defunct>