krabbl has asked for the wisdom of the Perl Monks concerning the following question:
Hi, I have a problem reaping child processes after a SIGINT has been caught by my signal handler, which only sets a global variable (don't blame me for using global vars). My script forks a certain number of children, and the parent is supposed to wait until all children have finished gracefully. Usually this works fine, but if I press CTRL-C the script behaves strangely.
As mentioned, my signal handler only sets a global variable to 1. Each child checks this variable at certain points; if it is set to 1, the child is supposed to write some data to a BerkeleyDB and then untie and undef some hashes/vars (used with BerkeleyDB). After that the child should exit so that the parent can reap it. For some children this happens very quickly, but for others it does not: those children seem to be frozen. After some time a few more of them finish, but mostly I have to kill all processes manually, and so I lose the data that has not yet been written to the BerkeleyDB.
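To make the pattern concrete, here is a minimal, self-contained sketch of the flag-based shutdown described above. It is not the real script: `do_one_item` is a hypothetical stand-in for one unit of a child's work, and the `kill 'INT', $$` line only simulates CTRL-C so that the example runs unattended.

```perl
use strict;
use warnings;

our $CHECK_CTRL = 0;                   # global flag, as in the script
$SIG{INT} = sub { $CHECK_CTRL = 1 };   # the handler only sets the flag

# Hypothetical stand-in for one unit of work (sleeps for 50 ms).
sub do_one_item { select(undef, undef, undef, 0.05) }

my $done = 0;
for my $item (1 .. 20) {
    kill 'INT', $$ if $item == 3;      # simulate CTRL-C for this demo only
    do_one_item();
    $done++;
    last if $CHECK_CTRL;               # the flag is only seen at this check
}

print "stopped after $done items, flag=$CHECK_CTRL\n";
```

When CTRL-C is pressed in a terminal, SIGINT goes to the whole foreground process group, so the parent and every child run the handler. Each child still only reacts the next time it reaches one of its checks.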
I tried different approaches such as Parallel::ForkManager, and the same behaviour occurs. If I use waitpid with WNOHANG, my CPU goes to 100% and nothing seems to work anymore (as far as my script is concerned). Sometimes the parent goes ahead and does not wait until the children have finished (is that normal behaviour?!). If it is, how can it be that sometimes my CPU is at 100% with nothing going on and sometimes my parent goes ahead without waiting? If I use wait() or waitpid($_,0), everything works fine except that not all children can be reaped (obviously because they have not completed), and it seems that they never finish at all. Some of them do after some time, but others do not.
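For comparison, the ways of waiting mentioned above differ mainly in what the parent does while children are still running. `waitpid(-1, WNOHANG)` returns immediately: a PID when a child has exited, 0 when children exist but none has exited yet, and -1 when there are no children left. A loop that calls it without pausing therefore spins at full CPU, and a loop that stops on 0 leaves before the children are done. Below is a minimal sketch with three dummy children that only sleep (they are not the real workers); the 50 ms pause is an assumed polling interval.

```perl
use strict;
use warnings;
use POSIX qw(WNOHANG);

my %kids;
for my $n (1 .. 3) {
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: pretend to work for a moment, then leave immediately.
        select(undef, undef, undef, 0.1 * $n);
        POSIX::_exit(0);               # skip END blocks in the child
    }
    $kids{$pid} = 1;                   # parent: remember the child
}

my $reaped = 0;
while (%kids) {
    my $pid = waitpid(-1, WNOHANG);
    if ($pid > 0) {                    # one child has finished
        delete $kids{$pid};
        $reaped++;
    }
    elsif ($pid == 0) {                # children still running: pause, don't spin
        select(undef, undef, undef, 0.05);
    }
    else {                             # -1: nothing left to wait for
        last;
    }
}

print "reaped $reaped children\n";
```

A blocking `waitpid($pid, 0)` per child avoids the polling altogether, but it only returns once that child really exits.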
This shit really drives me crazy, because I have to implement a way to end my program properly with the current "status" saved (that's what the children are doing when they write data to BerkeleyDB and some files), so that it can be resumed later by starting the whole script with the corresponding argument.
For legal reasons I'm not allowed to post the whole script. I'm sorry about that, but I hope you can still help me.
Three dots stand for removed, unimportant code such as vars, checks and system-call commands (bash commands only).
    $SIG{INT} = \&ctrlc;
    #$SIG{CHLD} = \&ripZombie;

    foreach(@ARGV){
        my $arg=$_;
        unless(defined($arg)){
            die("$!\n") if(help());
            exit 0;
        }
        if($arg eq '-e' || $arg eq '--erase'){
            die("$!\n") if(erase());
        }
        elsif($arg eq '-i' || $arg eq '--info'){
            die("$!\n") if(versionInfos());
        }
        elsif($arg eq '-c' || $arg eq '--create'){
            die("$!\n") if(createData());
        }
        elsif($arg eq '-p' || $arg eq '--proceed'){
            die("$!\n") if(continueDataCreation());
        }
        elsif($arg eq '-v' || $arg eq '--verify'){
            die("$!\n") if(verifyData());
        }
        elsif($arg eq '-s' || $arg eq '--search'){
            die("$!\n") if(searchExtInfos());
        }
        elsif($arg eq '-h' || $arg eq '--help'){
            die("$!\n") if(help());
        }
        elsif($arg eq '-x' || $arg eq '--extract'){
            die("$!\n") if(fallbackExtractResults());
        }
        else{
            die("$!\n") if(help());
        }
    }

    sub createData{
        ### Set default sig-handler for ctrl+c ###
        $SIG{INT} = "DEFAULT";
        ...
        my @a_childs=();
        ...
        ### Set custom sig handler for ctrl+c ###
        $SIG{INT} = \&ctrlc;

        ### Begin forking regarding amount of instances ###
        for(my $i=0; $i<$AMOUNT_INSTANCES; $i++){
            my $child=fork();
            if($child){
                ### parent ###
                push(@a_childs, $child);
            }
            elsif($child == 0){
                ### child ###
                ...
                ### Build BerkeleyDB Environment ###
                my $envChild = new BerkeleyDB::Env(
                    -Home => "$WORKDIR" ,
                    -Flags => DB_CREATE| DB_INIT_CDB | DB_INIT_MPOOL)
                    or die "FAILURE: Cannot open environment: $BerkeleyDB::Error\n";

                ### Tie Hash ###
                our $DB_RESULT = tie %H_RESULT, 'BerkeleyDB::Hash',
                    -Filename => "$FILE_RESULT_DB",
                    -Flags => DB_CREATE,
                    -Env => $envChild
                    or die "FAILURE: Cannot open database: $BerkeleyDB::Error\n";

                ### Tie Hash for IP-pref-pairs ###
                my $db_prefChild = tie %H_PREF, 'BerkeleyDB::Hash',
                    -Filename => "$FILE_PREF_DB",
                    -Flags => DB_CREATE,
                    -Env => $envChild
                    or die "FAILURE: Cannot open database: $BerkeleyDB::Error\n";

                ### only copies some files to certain positions ###
                system("...");

                ### Query responsible RIR and extract information ###
                while(keys(%H_PREF) > 0){
                    ### Lock hash ###
                    my $lock=$db_prefChild->cds_lock();

                    ### extract random key and mark it ###
                    my $key="";
                    my $value="";
                    do{
                        $key=(keys %H_PREF)[rand keys %H_PREF];
                        $value=$H_PREF{$key};
                    }while($value =~ m/^inuse\_/);
                    $H_PREF{$key}="inuse\_$value";

                    ### Unlock hash ###
                    $lock->cds_unlock();

                    ### this funktion does only some regex work and system-calls, at begin and end the var set by sig handler is checked ###
                    handlePref($key,$value);

                    ### Check if SIG{INT} (CTRL+C) was pushed ###
                    if($CHECK_CTRL == 1 ){
                        $lock=$db_prefChild->cds_lock();
                        $H_PREF{$key}=$value;
                        $lock->cds_unlock();
                        last;
                    }
                    delete($H_PREF{$key});
                }
                warn("$!\n") if(cleanUpChild($childPid));

                ### Clean Up BerkeleyDB Env and untie ##
                undef $DB_RESULT;
                untie %H_RESULT;
                undef $db_prefChild;
                untie %H_PREF;
                exit 0;
            }
            else{
                return 1, warn("FAILURE: Could not fork! $!\n");
            }
        }

        ### Wait/catch all childs ###
        foreach(@a_childs){
            while(){
                my $pid=waitpid(-1, WNOHANG);
                last if($pid <= 0);
            }
            print "Child ended successfully (PID: $_)!\n";
        }
        #foreach(@a_childs){
        #    wait();
        #    waitpid($_,0);
        #}
        ...
        return 0;
    }

    ### sig handler for ctrl-c ###
    sub ctrlc{
        $SIG{INT} = \&ctrlc;
        ### Set global check variable to 1 ###
        $CHECK_CTRL=1;
        print("! Please be patient, programm is existing but will end all current queries and this could take some minutes !\n");
        return 0;
    }

    ### reaper but not in use ###
    sub ripZombie{
        my $pid;
        while ((my $pid = waitpid(-1, WNOHANG)) > 0) {
            ### If you want to do sth with your reaped child pids ###
            ;
        }
        $SIG{CHLD} = \&ripZombie;
    }

Replies are listed 'Best First'.

Re: Could not catch all children after fork, some of them never end
by educated_foo (Vicar) on Mar 13, 2013 at 11:51 UTC
by krabbl (Initiate) on Mar 13, 2013 at 12:43 UTC
by educated_foo (Vicar) on Mar 13, 2013 at 14:40 UTC

Re: Could not catch all children after fork, some of them never end
by soonix (Chancellor) on Mar 13, 2013 at 23:10 UTC
by krabbl (Initiate) on Mar 14, 2013 at 16:08 UTC

Re: Could not catch all children after fork, some of them never end
by locked_user sundialsvc4 (Abbot) on Mar 13, 2013 at 14:37 UTC
by Anonymous Monk on Mar 14, 2013 at 02:24 UTC