in reply to killing threads inside forks

Since you haven't shown your code for how you're handling the signals, I can only hazard a guess. The OS SIGUSR1 comes to each child process' main thread (because that's how it works). Then the child process in turn does a $thread->kill() on each of your threads. So far so good?

See threads:

sending a signal to a thread does not disrupt the operation the thread is currently working on: The signal will be acted upon after the current operation has completed. For instance, if the thread is stuck on an I/O call, sending it a signal will not cause the I/O call to be interrupted such that the signal is acted upon immediately.

The good news is, there's a simple answer to the question of how to send signals to threads: don't. It's unreliable at best, and even if it worked, your threads would be in random states. Instead, instrument your threads with their own timeouts (if they're doing blocking operations, that's really where the timeouts belong anyway), and have the threads poll a threads::shared $shutdown variable that's set by your signal handler. Provided you have reasonable timeouts and don't just busy-wait, this is a reasonable use of polling. With the above caveat, remember your child program's main application flow is a thread, too, and subject to the same limitations. When the signal comes in, you give all the threads a few seconds to exit, and if there are still any running, you detach them and exit with a warning.

bash $ (sleep 1; killall -HUP the_parent) & perl threads.pl
[1] 4842
Thread 1: Created
Thread 2: Created
Thread 3: Created
Shutting down 3 threads
Thread 1: Shutting down
Thread 3: Shutting down
1 still running. Giving up!

#!/usr/bin/env perl
use 5.012;
use warnings FATAL => 'all';
use autodie;

if (my $pid = fork) {
    $0 = 'the_parent';
    local $SIG{HUP} = sub { kill USR1 => $pid };
    exit(0 < waitpid $pid,0);
}
$0 = 'the_child';

use threads;
use threads::shared;

my $shutdown :shared = 0;

my @thr = map {
    threads->create(sub {
        printf "Thread %d: Created\n", threads->tid;
        sleep 1 + rand 2 until $shutdown;
        printf "Thread %d: Shutting down\n", threads->tid;
        lock($shutdown);
        cond_signal($shutdown);
    })
} 1..3;

$SIG{USR1} = sub { $shutdown = 1 };
sleep until $shutdown;

say "Shutting down " . scalar threads->list . " threads";
my $until = 1 + time; # Wait this many seconds before abort
while (my $threads = threads->list(threads::running)) {
    lock($shutdown);
    cond_timedwait($shutdown, $until) or last;
}

if (my @remain = threads->list(threads::running)) {
    $_->detach for @remain;
    $_->join for threads->list(threads::joinable);
    die scalar @remain . " still running. Giving up!\n";
}

$_->join for @thr;
say "All threads exited normally.";

Re^2: killing threads inside forks
by mojo2405 (Acolyte) on Jul 31, 2013 at 14:45 UTC
    I should be clearer and show some code. This is the father starting its children:
    #Start fork
    foreach my $xml_obj (@{$xml_ojects_ref}){
        my $pid = fork();
        if ($pid) {
            # parent
            push(@childs, $pid);
            $framework_child_procs{$pid} = 1;
        }elsif($pid == 0) {
            # child
            my $results = new testProcess($xml_obj,$fatherSN); #Test process is a new run
            exit 0;
        }else {
            print "couldnt fork: $!";
            exit 1;
        }
    }
    #Wait for to end
    foreach my $child (@childs) {
        my $tmp = waitpid($child, 0);
        print "done with pid $tmp";
    }
    This is the code (inside the father class) that handles CTRL+C:
    $SIG{INT}=\&ctrl_c_handler;
    sub ctrl_c_handler {
        $SIG{'INT'}='IGNORE';
        my @procs = keys (%{fatherProcess::framework_child_procs});
        kill SIGUSR1 => @procs;
        foreach my $child (@procs) {
            my $tmp = waitpid($child, 0);
        }
    }
    This is the code inside a child (this is how I actually start the threads inside a child):
    my $t = threads->new (sub{
        local $SIG{'USR1'}=sub {threads->exit();};
        \&$functions_name(@parameters);
    });
    my $result = $t->join();
    Now the signal (CTRL+C) comes from the user / CLI, then it is caught by the father (going to the ctrl_c_handler function), and then the ctrl_c_handler function is supposed to send the USR1 signal to the threads, but the thread doesn't catch it. In addition, in the child (testProcess), before starting the thread, I have another USR1 handler, which is called only after the thread ends. I think your solution is not suitable here.. please help :)

      Hello mojo2405! Thank you for posting the additional details.

      Now the signal (CTRL+C) comes from the user / CLI, then it is caught by the father (going to the ctrl_c_handler function), and then the ctrl_c_handler function is supposed to send the USR1 signal to the threads, but the thread doesn't catch it. In addition, in the child (testProcess), before starting the thread, I have another USR1 handler, which is called only after the thread ends.

      The approach you're taking certainly will not work, because I don't see any evidence in your child process code that you are catching SIGUSR1 in the main thread and re-sending it to all of your threads one by one, as covered in threads:

      Catching signals

      Signals are caught by the main thread (thread ID = 0) of a script. Therefore, setting up signal handlers in threads for purposes other than THREAD SIGNALLING as documented above will not accomplish what is intended.

      This is especially true if trying to catch SIGALRM in a thread. To handle alarms in threads, set up a signal handler in the main thread, and then use THREAD SIGNALLING to relay the signal to the thread:
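A minimal sketch of the relay described in the quoted documentation, assuming short operations inside each thread so the emulated signal is noticed quickly. The names `worker`, `relay_to_threads`, `$ready` and `$stopped` are made up for illustration and are not from the posted code. Because `$thr->kill` fails if the target thread has not installed a handler yet, the sketch waits until every worker reports that its handler is in place. To keep the demo deterministic, the relay is called directly rather than triggered by a real OS signal.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use threads;
use threads::shared;
use Time::HiRes qw(sleep);

my $ready   :shared = 0;   # workers whose handler is installed (hypothetical name)
my $stopped :shared = 0;   # workers that acted on the relayed signal

sub worker {
    # Runs in this thread when the main thread calls $thr->kill('USR1').
    local $SIG{USR1} = sub {
        { lock($stopped); $stopped++; }
        threads->exit;
    };
    { lock($ready); $ready++; cond_signal($ready); }
    sleep 0.05 while 1;    # short operations; the handler runs between them
}

my @workers = map { threads->create(\&worker) } 1 .. 3;

# Wait until every worker has installed its handler.
{
    lock($ready);
    cond_wait($ready) until $ready == @workers;
}

# Relay a signal name to every thread that is still running.
sub relay_to_threads {
    my ($sig) = @_;
    $_->kill($sig) for threads->list(threads::running);
}

# In a real program the main thread would catch the OS signal and relay it:
$SIG{USR1} = sub { relay_to_threads('USR1') };

# Demo only: call the relay directly instead of waiting for a real signal.
relay_to_threads('USR1');
$_->join for @workers;
printf "%d workers stopped\n", $stopped;
```

Note that the relayed signal is still only acted upon between operations, so a thread stuck in a long blocking call will not react until that call returns.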

      mojo2405 wrote:
      I think your solution is not suitable here..

      Oh? Why is that? The one bit of code you have not shown or described is the code that actually runs in your thread (i.e., functions_name()). And, from what you've posted, your child process starts a single thread, waits for that thread to finish, and exits. Why do you even need threads at all? And why can't your threads exit based on polling a shutdown flag instead of a (fake) signal (which, by the way, is implemented via a glorified polling mechanism itself!)
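As a rough sketch of the shutdown-flag alternative for a child that starts one thread and waits for it: the worker checks a shared flag between short units of work, and the main thread polls `is_joinable` instead of blocking in `join`, so its own signal handler gets a chance to run. `long_running_job`, the step count and the 0.05 second sleeps are hypothetical stand-ins for the real test function, and the `kill` to `$$` only simulates the parent's SIGUSR1 for the demo.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use threads;
use threads::shared;
use Time::HiRes qw(sleep);

my $shutdown :shared = 0;   # set by the signal handler, polled by the worker

# Hypothetical stand-in for the real long-running function: it works in
# small steps and checks the flag between them.
sub long_running_job {
    my ($max_steps) = @_;
    my $done = 0;
    while ($done < $max_steps) {
        return "stopped after $done steps" if $shutdown;
        sleep 0.05;         # one short unit of work
        $done++;
    }
    return "finished $done steps";
}

# Installed before the thread is created; it only sets the shared flag.
$SIG{USR1} = sub { $shutdown = 1 };

my $thr = threads->create(\&long_running_job, 1_000);

# Poll instead of blocking in join, so the handler above can run.
my $ticks = 0;
until ($thr->is_joinable) {
    sleep 0.05;
    kill USR1 => $$ if ++$ticks == 5;   # demo only: simulate the parent's signal
}

my $result = $thr->join;
print "$result\n";
```

The job itself can still run for days; only each individual step needs to be short enough that the flag is checked at a reasonable interval.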

      please help :)

      Believe me, I'm trying. :-)

        Hey rjt! Thank you for your fast answers! I thought I wouldn't need to, but I have to show my code; maybe then I'll be clearer. First, I want to thank you for your help; I know you're trying :) My code can't have any timeout, because it can run for days or even weeks, so a timeout for the threads is not useful here. I tried adding an alarm trigger for the children, but the problem is the same. There is a lot of code, so I'll show only what is relevant to our case and cut the rest. Father:
        sub init {
            my ($logger_input,$xmlFile_input,$ini_file_input,$run_all_input,$gui_sync_file_input,$dump_input) = @_;
            #Setting and reading params from run_me
            readParamsFromRunme($logger_input,$xmlFile_input,$ini_file_input,$run_all_input,$gui_sync_file_input,$dump_input);
            #Handle Ctrl+c if not running via GUI
            $SIG{INT}=\&ctrl_c_handler if (!$gui_sync_file);
            ##kill -10 from UI
            $SIG{USR1}=\&usr_kill_handler;
        }

        # FUNCTION NAME: ctrl_c_handler
        # WRITTEN BY: Sagi
        # DATE: 1.11.2011
        # PURPOSE: Exit the test when user press Ctrl+c
        # IN PARAMS: None
        # RETURNED VALUES: None
        # Sample call:
        sub ctrl_c_handler {
            #Wait for to end
            $SIG{'INT'}='IGNORE';
            my @procs = keys (%{fatherProcess::framework_child_procs});
            kill ALRM => @procs;
            kill SIGUSR1 => @procs;
            foreach my $child (@procs) {
                my $tmp = waitpid($child, 0);
            }
            $father_self->finishRun("user stoped");
        }

        sub usr_kill_handler {
            my $signame = shift;
            printer::Logger_screen_message("info","got signal $signame",0,$logger);
            $father_self->finishRun("user stoped");
        }
        Forking the children from the father:
        sub start_fork_run{
            my $self = shift;
            my ($xml_ojects_ref,$night_run_flag) = @_;
            my @childs = ();
            #Get father PID
            my $father_pid=$$;
            #Get father SN
            my $fatherSN = $self->getSN();
            if (!$fatherSN){
                printer::Logger_screen_message("error","can't get father's SN from DB",0,$logger);
                exit 1;
            }
            #Start fork
            foreach my $xml_obj (@{$xml_ojects_ref}){
                my $pid = fork();
                if ($pid) {
                    # parent
                    push(@childs, $pid);
                    $framework_child_procs{$pid} = 1;
                }elsif($pid == 0) {
                    # child
                    my $results = new testProcess($xml_obj,$fatherSN,$gui_sync_file,$father_pid,$night_run_flag);
                    exit 0;
                }else {
                    printer::Logger_screen_message("error","couldnt fork: $!",0,$logger);
                    exit 1;
                }
            }
            #Wait for to end
            foreach my $child (@childs) {
                my $tmp = waitpid($child, 0);
                delete $framework_child_procs{$child};
                printer::Logger_screen_message("info","done with pid $tmp",0,$logger);
            }
            #End the whole topology run
            $self->finishRun("finish");
        }
        A child object:
        package testProcess;
        #use strict;
        use warnings;
        use Data::Dumper;
        use Getopt::Long;
        use File::Basename;
        use File::Temp;
        use Storable ('dclone');
        use logger;
        use reporter;

        my $logger;
        our $local_xml_obj;
        my $gui_sync_file;
        my $night_run;
        my $packageReporter;

        sub new {
            my $class = shift;
            my ($xml_obj,$fatherSN,$gui_sync_file_input,$father_pid,$night_run_input) = @_;
            #kill -10 from UI
            $SIG{'USR1'}=\&prc_kill_handler;
            #Reseting signals so only father will catch them
            $SIG{'INT'}='IGNORE';
            $SIG{'ALRM'}=sub {
                foreach my $thr (@parallel::threads){
                    $thr->kill('ALRM');
                }
            };
        }

        sub prc_kill_handler {
            my $signame = shift;
            printer::colorized_msg("\n got to prc_kill_handler \n","green");
            $logger->info("usr_kill_handler: got signal $signame line: ".__LINE__);
            testProcess::finishPackage("user stoped");
            testProcess::exit_run();
        }
        Now each child can call this parallel function (which creates as many threads as it wishes), and here is the problem: the child catches the signal only after the threads end:
        sub Functions_Run_in_threads_windows {
            my (%hash)=@_;
            printer::Logger_screen_message ("Info", "Start", "", $logger);
            my @temp_file;
            if(!%hash) {
                printer::Logger_screen_message("Error","Functions_Run_in_threads: The hash is empty",1,$logger);
                return %hash;
            }
            my ($functions_name,@temp_params,@parameters,%hash_results);
            @threads = ();
            for( my $i=0 ; $hash{"params_name_$i"} ; $i++) {
                @temp_params=$hash{"params_name_$i"};
                $functions_name=$hash{"function_name_$i"};
                @parameters=();
                #push (@parameters,$functions_name);
                if(!@{$temp_params[0]}) #check input if not reference return fail
                {
                    printer::Logger_screen_message("Error","Please send parameters array as referens",1,$logger);
                    return %hash;
                }
                #parse hash table and run the function
                for( my $index=0 ; $index<@{$temp_params[0]} ; $index++) {
                    if($temp_params[0][$index]) {
                        $tmp=$temp_params[0][$index];
                        push(@parameters,$tmp);
                    } else {
                        push(@parameters,"");
                    }
                }
                #Start new thread
                my $t = threads->create (sub{
                    $SIG{ALRM} = sub { die("Timeout\n"); };
                    \&$functions_name(@parameters);
                });
                push(@threads,$t);
            }
            my $index=0;
            foreach my $thread (@threads) #wait for all threads until the end and insert results to hash
            {
                $hash_results{$index}=$thread->join;
                $index++;
            }
            return %hash_results;
        }
        I hope you now understand it (and the problem) better. Thanks again.