JoeKamel has asked for the wisdom of the Perl Monks concerning the following question:

Hey all, sorry if this is a silly question in anyway, fairly new to perl. I'm using perl 5.8.8 and don't really have much freedom to upgrade to 5.10 -- I'm running under linux (a slightly customized fc 6 build; based on the 2.6.22.9 kernel

I'm in the undesirable position of putting a wrapper around a set of binaries (these are tests for racks(s) of equipment) that someone else has created in order to execute all of these binaries, parse the output, etc and provide a nice unified user view of the results.

I initally tried doing this with fork and pipe, but found that the performance of the pipes was fairly poor (i.e. stdout would get bogged down and some tests would fail due to timeouts).

I have now moved to a thread / Thread::Queue implementation, and performance is much better, but every now and again (not all the time), as the main script exits, i get:
A thread exited while 2 threads were running.
or
A thread exited while 2 threads were running. panic: MUTEX_LOCK (22) [op.c:352] during global destruction.
or something along those lines (sometimes i get a segfault for good measure as well).
Here's a snipped of the code that actually launches the threads.
my $thread = async { my $tid = threads->tid(); my ( $out, $in ); chdir $wd or die "Can't cd to $wd: $!\n"; my $pid = open2( $out, $in, $cmd ); # die "$!\n"; print $in "$_\n" foreach (@input); close $in or warn "close of input: $! $?\n"; my $err = 0; while (<$out>) { chomp; $err = 1 if (/^thread failed/); $Q->enqueue("$tid:$uut:$test:$_"); last if ($err); } kill( 2, $pid ) if $err; close $out; $Q->enqueue("$tid:$uut:$test:ENDEND"); waitpid ($pid, 0); }; $thread->detach();
then in the main script (thread 0), i sit around and pull on the queue.
my $tcount = count of tests that are running. while (1) { if ( $Q->pending ) { $_ = $Q->dequeue; chomp; if (/ENDEND$/) { $done++; print "\t-->got $done/$t_count $_\n"; } last if ( $done == $t_count ); next if (/ENDEND$/); } }

So, i would have thought that since i only send myself the ENDEND message once the open2 is done, and i make sure that i match up all of the tests i started with an end each that I would be okay to exit from the main program cleanly, but this is not so....

is there anyway that i can explicitly kill the threads? reading around I see the suggestion from the CPAN Threads module that maybe one should just turn these warnings off, but that doesn't seem right...

Any thoughts or ideas on how i can more likely have clean exits every time around?

Replies are listed 'Best First'.
Re: cleanly exiting threads
by ikegami (Patriarch) on Aug 13, 2008 at 17:25 UTC

    As I understand things, open2 forks (duplicating all the threads) and calls _exit if there's an error calling exec (causing threads to be exited prematurely in the child). I think your problem is related to that.

    And isn't the work directory a per-process state? If so, there's a race condition in your code.

    thread 1: chdir thread 2: chdir thread 2: open thread 1: open OOPS, chdir isn't as expected

    And another bug: You use open2 without waitpid, or do you set $SIG{PIPE} to handle that elsewhere?

      thanks for the pointer on the potential race condition. I haven't seen that as a problem yet, but i can see where it could be.

      I do have a waitpid, just lost it when copying. (i also have a lot of other things i've tried and commented out in that area).

      when open2 does fail, i actually a message stuck into $out (i assume this is because my $cmd has stderr redirected to stdout).

      this is what i am matching with:
      $err = 1 if (/^thread failed/);
      and in that case i exit from the loop and send myself the done message.

      I had suspected that this (the open getting stuck) was the problem which is why i also put in the extra kill just to make sure, but at this point, all of my tests are actually finishing successfully, (i.e. all output coming out, and no spurious tests left on any of the severs).

      thank you very much for this. the race condition was causing an infrequent, but mysterious tests refuse to start condition.
Re: cleanly exiting threads
by BrowserUk (Patriarch) on Aug 13, 2008 at 22:55 UTC

    On the basis of what you've said, it seems likely that the problem arises because you are using IPC::Open2 in conjunction with threads. IPC::Open2 "forks" the command to be processed. Under Win32, fork is implemented under the covers using a pseudo-process--which is a thread!

    It seems likely that if you are sure that all your threads are ending cleanly, that the extra thread is that created under the covers by the call to open2().

    A few possibilities come to mind, but are untested.

    The 'pid' returned from open2() is actually a thread id. Use $threadObj = threads->object( $pid ); to obtain a handle to that thread and then either:

    • $threadObj->join;
    • or $threadObj->kill( 'STOP' );

    prior to exiting the program.

    Or, if your code is otherwise working correctly except for the exit time warning, accept it is just a warning and ignore it.

    Other possibilities that might work, but without more information (code to look at and try) are just speculation, include:

    • Using kill on $pid.
    • Using something other than open2 to run the process. Eg. Win32::Process or a piped open. etc.

    Really, the first thing you need to determine is whether the extra running thread is one you have created or one started by open2. The best way to determine that is by using something like ProcessExplorer to determine what threads are still running when the program s about to exit. ProcessExplorer allows you to view the process and examine the threads including their system-assigned thread IDs (which are different from those used by the threads!).

    The problem here is that there are four identifiers for every Perl created thread:

    1. The system assigned ThreadID. A system-wide unique non-zero integer.
    2. The system assigned thread handle. An opaque object handle to system internal data structures.
    3. The threads object handle. A Perl assigned blessed reference.
    4. The Perl-assigned, process-unique integer.

    threads has a private method:_handle() which will give you the system assigned thread handle associated with a given (or current) Perl threads object. However, the identifier in ProcessExplorer is the system assigned thread ID. In order to discover which thread is still running at termination, it will be necessary to inspect the process (with ProcessExplorer) and note the thread IDs of the still running threads and relate them to the threads you've started and finished.

    The only way I know of doing that is to use the system API GetCurrentThread(), but that will involve using Inline C, or XS, or Win32::API.

    If you can determine for sure what thread is still running, then you have a start on working out why that is the case. If you can produce a runnable, cut-down version that demonstrates the problem, I might have a go at trying to come up with a solution.


    1. Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      Okay, I'm running under linux so my understanding that an open2 is actually a fork, correct?

      I have verified that my

      $kpid = waitpid ($pid, 0);
      at the end of the while loop is returning with $kpid always matching $pid -- which i believe means that the process is actually done.

      Having said that, I have no idea what the binaries that I am running with the open command may be doing. they maybe not quite finishing, so, maybe doing an explicit kill would be better than just waiting...or perhaps send the sigint, (which they all handle), then leave the waitpid in place. may be more reliable

      So thanks for the thoughts, i'll have to dig in again tomorrow.
        I'm running under linux

        Whoops! I completely missed that detail. Which kinda renders most of what i wrote redundant.

        In that case I can't offer much information, only an inate suspicion of mixing threads and fork. I did have some more to say about that, but reviewing it, I cannot find anything to back up my suspicions, so I'll just back out of this and wish you good luck.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: cleanly exiting threads
by zentara (Cardinal) on Aug 13, 2008 at 21:12 UTC
    I just had an afterthought I how you may experiment to force killing your detached threads. Use a shared variable $die and in your loop have return if $die. I'm not sure where to put it exactly in your code, but it would be a way to force breaking out of the while (<$out>) loop. It will force all detached threads to return. If it works for you, you can setup a shared hash, with a $shash{$thread_num}{'die'} for each thread, so you can control which threads get the $die=1 and return. You may also add a sub to perform in the thread when the die is received, like possibly closing the $out filehandle before returning. This is why I don't use Queue or any of the other thread conveyer belt modules, they separate you from full control.
    use threads::shared; my $die : shared; #declare as shared before setting value $die = 0; while (<$out>) { return if $die; chomp; $err = 1 if (/^thread failed/); $Q->enqueue("$tid:$uut:$test:$_"); last if ($err); } while (1) { ....... ........ # put $die at an appropriate spot $die =1; ...... }

    I'm not really a human, but I play one on earth Remember How Lucky You Are
      Thanks for this, it has greatly helped with some cases where the tests I'm running actually hang or fail (due to bad hardware or inserted faults).

      Here's what I ended up actually implementing

      then I call set die in one of two ways: either via setting it at the end of the while(1) loop, or when an alarm is triggered

      At this point, I never have tests that mysteriously fail to start. (thanks for the pointer on the possible race condition), I always have every test that is started send its end to the results receiver (sometimes being voilently killed via the alarm and setting of die, which does go and kill the pid for the open process, close out, etc), and yet even still, I always have two threads upon exit; even when nothing goes wrong and everything completes normally.

      I'm going to try a little bit with the join method instead of detach, but at this point i'm almost inclined to turn on ignore for them, as one of the monks had suggested. Since the OS seems to be cleaning them up for me okay...

      however, any other thoughts would be welcome

        Yeah it's gets tricky because each thread gets a copy of the parent at the time of creation, it's a Perl problem, not a c thread problem. That means a thread may have duplicated code from a previous thread in it, just sitting around keeping a refcount > 0. Hard to track down.

        The only absolutely foolproof way of doing it, is to create all threads at the beginning of the script, then REUSE the threads, over and over. See Reusable threads demo for the basic idea.

        Otherwise, you may be best just turning off the warning, and watch for weird glitches or memory gains as your script runs.

        All I can say is good luck, because I have run into similar problems many times, and now immediately use sleeping reusable threads right from the start of design. I don't even consider detaching, as it almost always leads to memory gains unless you somehow reuse the thread's scalar namespace. Also spawning threads is pretty intensive, so you want to minimize it......reuse threads and join at exit.


        I'm not really a human, but I play one on earth Remember How Lucky You Are
Re: cleanly exiting threads
by zentara (Cardinal) on Aug 13, 2008 at 18:32 UTC
    A thread exited while 2 threads were running. panic: MUTEX_LOCK (22) op.c:352 during global destruction.

    The first error may be due to your detached threads not returning. A thread must return (or reach the end of it's code block) before it can be joined or cleanly exited. Your thread may be hanging in that while<$out> loop.


    I'm not really a human, but I play one on earth Remember How Lucky You Are
      I would agree, except that i get this message even when I recieve the ENDEND on the other end of the queue for every test (or thread, since I have 1 thread per test + the one to rule them all).

      in perl 5.8.8 is there any equivalent of thread->exit()? Can I explicitly kill threads in anyway?

      interestingly enough if i do a threads->list() right before the main script exits, I don't get any response.

      In this run, I only have two threads, (i.e. one test running), plus thread 0 and it is still reporting exiting with two running.

      The code:
      my @list = threads->list(); print Dumper(\@list); print "\ntotal time: ", format_time( tv_interval($t0) ), "\n"; __END__
      The output:
      $VAR1 = []; total time: 0:1:1.26 A thread exited while 2 threads were running.
        I don't use Thread::Queue myself, but I had this old sample around that says it's kindof obsolete, I may be wrong. Maybe try this style?
        #!/usr/bin/perl #leaks memory- Threads::Queue is for the old Threads <5.7 use strict; use threads; use Thread::Queue; my @threads; my $thread; my $q = Thread::Queue->new(); $q->enqueue(1..100000); print "Items in the queue: ",$q->pending,"\n"; for (1..5) { push @threads, threads->new(\&ttest); print "spawned thread:"; } foreach $thread (@threads){ $thread->join; } sub ttest { while (my $cnt = $q->pending) { my $item = $q->dequeue; print "$cnt\n"; last if $cnt == 0 ; } }

        I'm not really a human, but I play one on earth Remember How Lucky You Are