in reply to Re^2: Perl Thread Quitting Abnormally
in thread Perl Thread Quitting Abnormally

Thread 20 terminated abnormally: panic: COND_SIGNAL (298) at cm.pl line 487. Terminating on signal SIGINT(2)

The error breaks down into several parts:

  1. Thread 20 terminated abnormally:

    Fairly obviously means that thread number 20 was started, but it terminated as a result of something other than return or 'running off the end of the sub'.

  2. panic: COND_SIGNAL (298) at cm.pl line 487.

    This is the reason it terminated. Perl (threads.pm) itself terminated it because of an unexpected internal error condition (panic).

    In this case, the code is executing (either explicitly in your code or implicitly through perl internal code):

        #define COND_SIGNAL(c) \
            STMT_START { \
                if ((c)->waiters > 0 && \
                    ReleaseSemaphore((c)->sem,1,NULL) == 0) \
                    Perl_croak_nocontext("panic: COND_SIGNAL (%ld)",GetLastError()); \
            } STMT_END

    The 298 is the system error code returned by GetLastError after the ReleaseSemaphore() call fails.

    It translates to "Too many posts were made to a semaphore." which is further explained as

    There is a limited number of times that an event semaphore can be posted. You tried to post an event semaphore that has already been posted the maximum number of times.

    If you are making extensive use of threads::shared::cond_* calls in your code, that could be the root of the problem. If you want help in debugging that further, you will have to "show us the code".

    If you are not using the cond_* calls in your code, then it could be that you've unearthed a bug in Perl itself. You might try upgrading your version of Perl to (say) 5.10.1, and/or your versions of threads and threads::shared.

  3. Terminating on signal SIGINT(2)

    Under most circumstances, this will only occur if you (or someone) types ^C. Did you fail to mention that your application is hanging and you get the above error message only when you interrupt it?
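
    For reference on point 2, this is roughly what a well-behaved use of the threads::shared cond_* calls looks like. It is only a minimal sketch, not a diagnosis of cm.pl: the variable $ready and the single waiter thread are made up for illustration. The points it shows are that the signaller holds the lock while it changes the shared state and signals, and that the waiter re-checks its predicate in a loop around cond_wait.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical shared flag; it doubles as the lock/condition variable.
my $ready :shared = 0;

# The waiter takes the lock, then waits until the flag is really set.
# The 'until' loop guards against wake-ups that arrive without the
# predicate being true.
my $waiter = threads->create( sub {
    lock($ready);
    cond_wait($ready) until $ready;
    return "saw ready=$ready";
} );

# The signaller changes the state and signals while holding the lock.
{
    lock($ready);
    $ready = 1;
    cond_signal($ready);
}

my $msg = $waiter->join;
print "$msg\n";
```

    If your code signals without holding the lock, or signals repeatedly in a tight loop, that would be the first place I would look.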


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
RIP an inspiration; A true Folk's Guy

Re^4: Perl Thread Quitting Abnormally
by Anonymous Monk on Jul 06, 2010 at 09:38 UTC

    Hi,

    Thanks for the help, will give me something to investigate. For now, simply turning the number of threads down to 5 seems to have helped.

    I do use $threads[$threadno]->kill('STOP'); in the code to stop threads that go on for too long. I then trap this with $SIG{'STOP'} = sub {$end_ne=1;}; and test for the value of $end_ne in the code at the end of any part that may take a while. Could this be causing the semaphore errors?

    I can't just kill the long-running thread and re-create it; I need to simply back out of the current NE in that thread and set it to a state where the handler can pass it a new NE to try. Unfortunately Perl uses memory up (~5MB from memory) every time a thread starts and doesn't release it until the whole program exits. Given the script could run on 5000 NEs, a major network interruption could therefore see nearly 5000 threads created and killed!!

    I'm not typing ^C and no one else is logged in.

    Thanks

    Graham

      I do use $threads[$threadno]->kill('STOP'); in the code to stop threads that go on for too long. I then trap this with $SIG{'STOP'} = sub {$end_ne=1;}; and test for the value of $end_ne in the code at the end of any part that may take a while. Could this be causing the semaphore errors?

      Quite likely.

      I do not use signals in conjunction with threads, as my initial experiments with them showed that they a) often seemed to be the source of mysterious problems; b) made for hard-to-debug code; c) achieved nothing that was not more easily and better achieved in other ways.

      For example, for your purpose of interrupting a long running thread by polling the state of a variable, simply making that variable shared and then setting it true from a different thread, achieves the same end without the additional complexities of out-of-line callbacks and all the nastiness that underlies them:

          my @end_ne :shared = (0) x NTHREADS;
          ...
          sub threadHandler{
              my $tid = threads->tid;
              ...
              if( $end_ne[ $tid ] ) {
                  return;
              }
              ...
          }
          ...
          if( time() > ... ) {
              $end_ne[ $someTid ] = 1;
          }
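
      A fuller, runnable version of the same idea might look like the sketch below. The names $NTHREADS and worker, the 0.1 second "timeout", and the short sleep standing in for a slice of real work are all made up for illustration; only the shared @end_ne array indexed by thread id comes from the outline above.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Time::HiRes qw(sleep);

my $NTHREADS = 3;    # hypothetical number of workers

# One stop flag per thread, indexed by tid (tid 0 is the main thread).
my @end_ne :shared = (0) x ( $NTHREADS + 1 );

sub worker {
    my $tid   = threads->tid;
    my $steps = 0;

    # Poll the flag between slices of work instead of trapping a signal.
    until ( $end_ne[$tid] ) {
        sleep 0.01;    # stand-in for one slice of real work
        ++$steps;
    }
    return "thread $tid stopped after $steps steps";
}

my @thr = map { threads->create( \&worker ) } 1 .. $NTHREADS;

# The controlling thread decides the workers have run long enough and
# simply sets their flags.
sleep 0.1;
$end_ne[ $_->tid ] = 1 for @thr;

my @reports = map { $_->join } @thr;
print "$_\n" for @reports;
```

      The worker only notices the flag at the points where it tests it, so the tests need to sit between the pieces of work that can take a long time.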

      Unfortunately Perl uses memory up (~5MB from memory) every time a thread starts and doesn't release it until the whole program exits.

      Hm. Sounds like you are failing to join your old threads, as that is the only way they would continue to consume memory after death. (Most of) Their memory will not be returned to the OS, but it will be returned to the process memory pool for reuse, unless you fail to join them.

      By way of demonstration, the following program starts (checks memory), creates 50 concurrent threads (checks memory), and then tells one thread to end and replaces it with another until 5000 threads have been created and destroyed.

      After the first 50 are created, the memory stands at 123.4 MB. Subtracting the start-up size of 6.6 MB, that gives 2.3 MB/thread. It then goes on to create and destroy 4950 more threads in quick succession--takes about a minute on my system--and when it's done the total process memory pool has increased to 137.1 MB. Subtract that used by the first 50 and you get 13.7MB/4950 = 0.00276MB/thread. That's just about 3k, and is probably just caused by heap fragmentation.

      Not that I would advocate this method of threading for your application--a pool of threads is the right way to go--but it does lay bare one of many misinterpretations that are made about threaded code.

          #! perl -slw
          use strict;
          use threads ( stack_size => 4096 );
          use threads::shared;

          my @end :shared = (0) x 5000;

          sub thread {
              my $tid = threads->tid;
              Win32::Sleep( 10 ) until $end[ $tid ];
              --$end[ $tid ];
              return;
          }

          printf "Check memory: "; <>;
          threads->create ( \&thread )->detach for 1 .. 50;
          printf "Check memory: "; <>;

          for my $tid ( 1 .. 4950 ) {
              printf "\r$tid";
              ++$end[ $tid ];
              Win32::Sleep( 10 ) while $end[ $tid ];
              threads->create ( \&thread )->detach;
          }
          ++$end[ $_ ] for 4950 .. 5000;
          printf "\nCheck memory: "; <>;

          __END__
          c:\test>t-junk.pl
          Check memory: 6.6 MB
          Check memory: 123.4 MB
          4950
          Check memory: 137.1 MB
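
      For comparison, here is roughly what the pool-of-threads approach could look like using Thread::Queue. It is a sketch under assumptions, not your application: the host names, the pool size $NWORKERS, and the placeholder "work" inside worker are all invented. A fixed set of threads is started once, each pulls hosts from a shared queue until it sees an undef sentinel, and results come back on a second queue.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $NWORKERS = 4;                              # hypothetical pool size
my @hosts    = map { "host$_" } 1 .. 20;       # hypothetical work items

my $work    = Thread::Queue->new;              # hosts waiting to be polled
my $results = Thread::Queue->new;              # "host:status" strings

sub worker {
    # dequeue() blocks until an item arrives; undef is the stop sentinel.
    while ( defined( my $host = $work->dequeue ) ) {
        my $status = length($host) ? 'OK' : 'NOK';   # placeholder for real work
        $results->enqueue("$host:$status");
    }
    return;
}

# Start the pool once; no threads are created or destroyed per host.
my @pool = map { threads->create( \&worker ) } 1 .. $NWORKERS;

$work->enqueue(@hosts);
$work->enqueue( (undef) x $NWORKERS );         # one sentinel per worker

$_->join for @pool;

my @done;
while ( defined( my $r = $results->dequeue_nb ) ) {
    push @done, $r;
}
printf "%d hosts processed by %d threads\n", scalar @done, $NWORKERS;
```

      With this shape, giving up on a slow host becomes something the worker does for itself (for example by checking a shared flag or a deadline between steps) before looping round for the next host, so nothing ever needs to be killed and restarted.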

      On the basis of the scant description of your application, I think that it could probably be greatly improved with a few tweaks to the mechanisms you are using for 'command & control'.

      Is it possible for you to post the shell of the application--the main code where you create the threads and thread procedure showing the outline of the control mechanisms with the guts of the non-thread related code elided?

      I'm not typing ^C and no one else is logged in.

      Something is causing your process to receive a SIGINT. It may be that your SIGSTOP is being internally translated into a SIGINT by the signals emulation code--the Perl signals emulation on Windows does not directly support SIGSTOP. Or this could be some uncharted interaction between the signals emulation in the core and that layered on top by the threads signals. (Which should never have been added in the first place IMO.)

      Again, if you can post your code--with most of the SNMP stuff elided--it might be possible for me to re-create the problem locally and track down the source.



        Hi,

        Thanks for the suggestion of using a shared array for $end_ne. I was initially using a signal to kill the thread and then restart it; when I changed to just setting a variable, I didn't think to use an array. Will change that.

        You asked me to post the command and control code:

            my @hosts = keys %nes;
            my @hosts_ok;
            my $max_threads = $config->val($section,'MaxThreads');

            for my $threadno (0 .. ($max_threads-1)){
                if (@hosts){
                    my $host = shift (@hosts);
                    $thread_ip[$threadno] = $host;
                    $thread_result[$threadno] = 0;
                    $thread_time[$threadno] = time();
                    push @threads, threads->create('get_ne_data',$threadno);
                    $threads[@threads-1]->detach();
                    write_log ('MAIN','STARTING THREADS',1,'001b','Starting thread',$threads[@threads-1]->tid(),'NE',$host);
                    sleep(1);
                }
            }

            while (@hosts){
                for my $threadno (0 .. (@threads-1)){
                    if ($threads[$threadno] and $threads[$threadno]->is_running()){
                        if ($thread_result[$threadno]){
                            # thread as returned
                            if ($thread_result[$threadno] == 2){
                                write_log ('MAIN','PROCESSING NES',2,'001c','Thread',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno],'Finished OK');
                                push @hosts_ok, $thread_ip[$threadno];
                            } else {
                                write_log ('MAIN','PROCESSING NES',2,'001d','Thread',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno],'Finished NOK');
                            }
                            $thread_time[$threadno] = time();
                            $thread_result[$threadno] = 0;
                            if (@hosts){
                                $thread_ip[$threadno] = shift @hosts;
                            } else {
                                # print "\tFINISHED\n";
                                $thread_ip[$threadno] = 'FINISHED';
                            }
                            write_log ('MAIN','PROCESSING NES',2,'001e','Thread',$threads[$threadno]->tid(),'is being assigned',$thread_ip[$threadno]);
                        } elsif ($thread_time[$threadno] < (time-60) ){
                            write_log ('MAIN','PROCESSING NES',2,'001f','Thread',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno],'Being killed');
                            # print "\t$thread_ip[$threadno] is being killed\n";
                            $threads[$threadno]->kill('STOP');
                            $thread_time[$threadno] = time();
                            $thread_result[$threadno] = 0;
                            if (@hosts){
                                $thread_ip[$threadno] = shift @hosts;
                            } else {
                                # print "\tFINISHED\n";
                                $thread_ip[$threadno] = 'FINISHED';
                            }
                            write_log ('MAIN','PROCESSING NES',2,'0020','Thread',$threads[$threadno]->tid(),'is being assigned',$thread_ip[$threadno]);
                        }
                    } else {
                        # for some reason we don't have a thread here - possibly stopped due to long run time
                        # print "\tTrying to restart thread\n";
                        if (@hosts){
                            my $host = shift (@hosts);
                            $thread_ip[$threadno] = $host;
                            $thread_result[$threadno] = 0;
                            $thread_time[$threadno] = time();
                            $threads[$threadno] = threads->create('get_ne_data',$threadno);
                            $threads[$threadno]->detach();
                            write_log ('MAIN','PROCESSING NES',4,'0021','THREAD NEEDS RESTARTING',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno]);
                        }
                    }
                }
                sleep(1);
            }
            write_log ('MAIN','',2,'0022','All hosts started');

            my $threads_running = 1;
            while ($threads_running){
                $threads_running = 0;
                for my $threadno (0 .. (@threads-1)){
                    if ($threads[$threadno] and $threads[$threadno]->is_running()){
                        $threads_running++;
                        if ($thread_result[$threadno]){
                            # thread as returned
                            if ($thread_result[$threadno] == 2){
                                write_log ('MAIN','CLEARUP',2,'0022','Thread',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno],'Finished OK');
                                push @hosts_ok, $thread_ip[$threadno];
                            } else {
                                write_log ('MAIN','CLEARUP',2,'0023','Thread',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno],'Finished NOK');
                            }
                            $thread_ip[$threadno] = 'FINISHED';
                            $thread_result[$threadno] = 0;
                        } elsif ($thread_time[$threadno] < (time-60) ){
                            write_log ('MAIN','CLEARUP',2,'0024','Thread',$threads[$threadno]->tid(),'NE',$thread_ip[$threadno],'Being killed');
                            $threads[$threadno]->kill('STOP');
                            $thread_time[$threadno] = time();
                            $thread_result[$threadno] = 0;
                            $thread_ip[$threadno] = 'FINISHED';
                        }
                    }
                }
            }

      I don't think that mixing signals and threads actually works. Especially under Windows, where signals are emulated at best.