xiaoyafeng has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

I'm new to threads, but I've heard about their benefits for a long time. So I decide to give up fork and choose threads in my next programme.
But I'm stuck at the very start :( The snippet below is from perlthrtut and is very simple.
use threads;

$Param3 = "foo";
$thr = threads->new(\&sub1, "Param 1", "Param 2", $Param3);
$thr = threads->new(\&sub1, @ParamList);
$thr = threads->new(\&sub1, qw(Param1 Param2 Param3));

sub sub1 {
    my @InboundParameters = @_;
    print "In the thread\n";
    print "got parameters >", join("<>", @InboundParameters), "<\n";
}

__OUTPUT__
D:\>perl thr_test.pl
In the thread
got parameters >Param 1<>Param 2<>foo<
In the thread
got parameters ><
A thread exited while 2 threads were running.

D:\>perl thr_test.pl
In the thread
got parameters >Param 1<>Param 2<>foo<
A thread exited while 3 threads were running.

D:\>perl thr_test.pl
In the thread
got parameters >Param 1<>Param 2<>foo<
A thread exited while 3 threads were running.
I found that the output differs from run to run. According to the warnings, it seems some threads have already quit while another thread is still running. Must all threads quit together? That's odd! Then I added 'sleep' to the subroutine to try to delay the thread's exit, as below.
use strict;
use warnings;
use threads;
no warnings 'threads';

my @ParamList = qw (1 2 3);
my $Param3 = "foo";
my $thr = threads->new(\&sub1, "Param 1", "Param 2", $Param3);
$thr = threads->new(\&sub1, @ParamList);
$thr = threads->new(\&sub1, qw(Param1 Param2 Param3));

sub sub1 {
    my @InboundParameters = @_;
    sleep 10;
    print "In the thread\n";
    print "got parameters >", join("<>", @InboundParameters), "<\n";
}

__OUTPUT__
D:\>perl thr_test.pl
A thread exited while 4 threads were running.

D:\>perl thr_test.pl
A thread exited while 4 threads were running.

D:\>perl thr_test.pl
A thread exited while 4 threads were running.

D:\>perl thr_test.pl
A thread exited while 4 threads were running.
The warning is still emitted, although the situation seems better than before.

Could anyone enlighten me with some suggestions? Thanks in advance!


I am trying to improve my English skills, if you see a mistake please feel free to reply or /msg me a correction

Re: Threads question
by BrowserUk (Patriarch) on Jun 29, 2007 at 04:55 UTC

    Unfortunately, perlthrtut is probably the single worst piece of documentation in the entire perl docset. And the second worst advert for Perl + threads.

    The warning you are getting indicates that the main thread ended, because it 'ran off the end of the script', before the other 3 threads had ended. The main thread is the first thread in your program: the one started when the program is run, and the one from which you create your other 3 threads.

    And, because the main thread, in a threaded program, is also the process, when the main thread dies, all other threads die also.

    The reason some or all of the other 3 threads hadn't died or 'gone away' earlier, even though at least one or two of them had finished doing whatever they were doing, is that threads can return values, just like processes can. And just as with fork, they don't die until someone collects those values. With fork you have to call wait or waitpid to retrieve the process return values and clean up the zombie processes.

    If you look at the section immediately after the one you copied the above code from, "Waiting For A Thread To Exit", you will see that it introduces the join() method, which is the threads equivalent of waitpid. There is no equivalent of wait. (More's the pity.)

    (This originally had wait and waitpid the wrong way round. Corrected per ikegami's post++ below.)

    So, to retrieve the return values from your 'zombie', 'peer' threads (there is no parent-child relationship), you have to call join, even if you discard the return values having done so.

    Adding this to the example code ensures that:

    1. The main thread lives long enough for the other threads to do what they need to do and exit.
    2. Those threads' resources are cleaned up before the process exits, thereby avoiding the warning message you have been seeing.

    use threads;

    $Param3 = "foo";
    $thr1 = threads->new(\&sub1, "Param 1", "Param 2", $Param3);
    $thr2 = threads->new(\&sub1, @ParamList);
    $thr3 = threads->new(\&sub1, qw(Param1 Param2 Param3));

    sub sub1 {
        my @InboundParameters = @_;
        print "In the thread\n";
        print "got parameters >", join("<>", @InboundParameters), "<\n";
    }

    $thr1->join;  ## Wait for thread 1 to finish
    $thr2->join;  ## Wait for thread 2 to finish
    $thr3->join;  ## Wait for thread 3 to finish

    __END__
    C:\test>junk2
    In the thread
    got parameters >Param 1<>Param 2<>foo<
    In the thread
    got parameters ><
    In the thread
    got parameters >Param1<>Param2<>Param3<

    There is also another method of avoiding the need to call join() on your threads. You can detach() them.

    This is roughly equivalent to the *nix double forking daemonisation process, whereby the process forks twice, and the child terminates leaving the grandchild running. This breaks the parent-child-grandchild relationship and ensures that the parent doesn't have to hang around to clean up the grandchild.

    My apologies if this description is flawed. I've done very little with fork, and what I did do was a long time ago. Hopefully, the analogy is still good.

    Eg.

    use threads;

    $Param3 = "foo";
    threads->new(\&sub1, "Param 1", "Param 2", $Param3)->detach;
    threads->new(\&sub1, @ParamList)->detach;
    threads->new(\&sub1, qw(Param1 Param2 Param3))->detach;

    sub sub1 {
        my @InboundParameters = @_;
        print "In the thread\n";
        print "got parameters >", join("<>", @InboundParameters), "<\n";
    }

    ## wait long enough to ensure that the threads
    ## will have completed before we let the main thread
    ## and the process die.
    sleep 10;

    __END__
    C:\test>junk2
    In the thread
    got parameters >Param 1<>Param 2<>foo<
    In the thread
    got parameters ><
    In the thread
    got parameters >Param1<>Param2<>Param3<

    But note: it is still necessary to ensure that the main thread sticks around long enough for the other thread to finish. Using a sleep as above is not recommended for any real purpose. This makes detach() pretty useless as there is no way to query whether any detached threads are still running*.

    *Even the extended forms of the list() class method in the updated cpan version still do not give access to this information.

    I do not know of the existence of any better tutorial material for ithreads than perlthrtut. Whilst it would be possible to clean up the contents of that document--adding use strict; and join() etc. to the samples--via some judicious document patches, it would do little to improve its overall quality as a tutorial. It needs to be re-written so as to demonstrate usable techniques for working with ithreads--a set of design patterns if you will.

    Whilst I have acquired enough experience of using ithreads to be able to write reasonable programs for most situations I've encountered where ithreads are applicable, my lack of a multi-cored/multi-processor testbed means that all my techniques and tests are strictly limited to a single processor environment. Trying to project that knowledge to cover usage in multi-processor environments--never mind other platforms--is just too difficult a mindware project to undertake.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      the join() method, which is the threads equivalent of wait. There is no equivalent of waitpid. (More's the pity.)

      That's backwards. join waits for a specific thread to end, just like waitpid($pid) waits for a specific process to end, and waitpid($pid, WNOHANG) is available as is_joinable. What's missing is the ability to wait for any thread to end (the equivalent of wait and waitpid(-1)).

      wait could be emulated as follows:

      use Time::HiRes qw( sleep );

      sub wait_threads {
          for (;;) {
              my @joinable = threads->list(threads::joinable);
              if (@joinable) {
                  return shift(@joinable)->join();
              }

              sleep(0.010);
          }
      }
Re: Threads question
by TOD (Friar) on Jun 29, 2007 at 02:31 UTC
    two points: first, perl threads cannot be started and forgotten. you must explicitly call the join() method on each of your threads.
    second, for the unpredictable output: set the $| variable to a nonzero value, which forces perl to autoflush any output. this may be done globally, or within the printing sub:
    local $| = 1
    you will find further explanations in perldoc perlvar

    --------------------------------
    masses are the opiate for religion.
      Actually, I copied the code from perlthrtut.
      I don't see any comment or code there telling me to add $|++ or to call the join method on each thread. Anyway, if you are right, I doubt whether the thread tutorial is clear enough.

      I am trying to improve my English skills, if you see a mistake please feel free to reply or /msg me a correction
      It's not a buffering issue. You have two threads trying to access the same resource (STDOUT). Whenever that happens, you have to use a locking mechanism to ensure a thread is done with the resource before another thread can use it.
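
A minimal sketch of such a locking mechanism, assuming a shared scalar that exists only to be locked. The names $print_lock, say_locked and worker are hypothetical and do not come from perlthrtut. Each thread holds the lock while it prints, so its two output lines stay together, and the main thread joins every thread before it exits.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Hypothetical lock variable: its value is never used, it is only locked.
my $print_lock :shared;

# Print all arguments while holding the lock, so output from
# different threads cannot interleave within one call.
sub say_locked {
    my @text = @_;
    lock($print_lock);    # released automatically when this sub returns
    print @text;
}

sub worker {
    my @params = @_;
    say_locked(
        "In the thread\n",
        "got parameters >", join('<>', @params), "<\n",
    );
    return scalar @params;    # number of parameters received
}

# create() is called in list context here, so join() returns a list.
my @threads = map { threads->create(\&worker, @$_) } (
    [ 'Param 1', 'Param 2', 'foo' ],
    [],
    [ qw(Param1 Param2 Param3) ],
);

# Joining keeps the main thread alive until every worker has finished.
my @counts = map { $_->join } @threads;
```

The order in which the three threads print is still not fixed; the lock only guarantees that each thread's pair of lines is not split by another thread's output.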
Re: Threads question
by jdrago_999 (Hermit) on Jun 29, 2007 at 04:01 UTC
    You need to wait for all of your threads to join up with the parent:
    use threads;

    # Create 5 threads:
    threads->create(sub { print "Hello from # $_[0]\n" }, $_) for 1 .. 5;

    # Then when you want to just let your threads work:
    $_->join foreach threads->list;

    # Now the threads are done:
    print "threads are all done!";
Re: Threads question
by zentara (Cardinal) on Jun 29, 2007 at 11:47 UTC
    One thing that hasn't been mentioned clearly is how to cleanly end a thread prematurely. All of the discussion above assumes that the thread will finish its processing and then be joined... but what if you want to join the thread BEFORE it has completed its code block?

    The basic rule to remember is that in order for a thread to be joined, it must reach the end of its code block OR do a return. You usually use a shared variable, such as $die, to signal the thread to return. The main thread sets $die=1, and the thread code has

    return if $die;
    at crucial code points.
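
The $die idea above could be sketched roughly as follows. This is only an illustration: the worker body, the 50 ms polling interval and the one-second run time are assumptions made for the example, not something from the thread.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared flag: the main thread sets it, the worker polls it.
my $die :shared = 0;

sub worker {
    my $iterations = 0;
    while (1) {
        return $iterations if $die;            # a "crucial code point"
        $iterations++;                         # stand-in for real work
        select(undef, undef, undef, 0.05);     # pause for about 50 ms
    }
}

my $thr = threads->create(\&worker);

sleep 1;       # let the worker run for a while
$die = 1;      # ask the worker to return early

# join() now returns promptly because the worker does a return.
my $done = $thr->join;
print "worker stopped after $done iterations\n";
```

The worker only notices the flag when it reaches the check, so a long blocking call between checks delays the shutdown by that much.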

    This brings up the next big issue you will face, and that is thread code being repeatedly called, can cause a memory gain, because of the imperfect garbage collection of Perl. So... if your program needs to constantly spawn and join threads, you may need to set up a scheme to reuse threads: put them in a sleep loop when you don't need them, and feed them a wakeup call and fresh data when you do.
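
One way to sketch such a reuse scheme is a small pool of long-lived workers fed through the core Thread::Queue module. Note that this swaps the sleep loop and wakeup call described above for a blocking dequeue(), which parks an idle worker until data arrives. The pool size of 3, the doubling "job" and the use of undef as a stop signal are assumptions made for the example.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $jobs    = Thread::Queue->new;    # work handed to the pool
my $results = Thread::Queue->new;    # answers handed back

sub worker {
    # dequeue() blocks while the queue is empty, so idle workers
    # wait here instead of being destroyed and recreated.
    while (defined(my $job = $jobs->dequeue)) {
        $results->enqueue($job * 2);    # stand-in for real work
    }
}

# Start the workers once and reuse them for every job.
my @pool = map { threads->create(\&worker) } 1 .. 3;

$jobs->enqueue(1 .. 10);

# One undef per worker tells each of them to leave its loop.
$jobs->enqueue((undef) x @pool);
$_->join for @pool;

# All ten results are queued by now, so these dequeues do not block.
my @doubled = sort { $a <=> $b } map { $results->dequeue } 1 .. 10;
print "@doubled\n";
```

Only three threads are ever created here, however many jobs are queued, which sidesteps repeated thread creation and destruction.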

    There are many examples of this in the nodes here at perlmonks; just search for "threads" and look at all the code examples. (Or search groups.google.com for "perl threads".)

    A final point, changes in shared variable's values are not automatically seen between threads. Each thread needs to run a timer or loop to constantly check them.


    I'm not really a human, but I play one on earth. Cogito ergo sum a bum
      This brings up the next big issue you will face, and that is thread code being repeatedly called, can cause a memory gain, ...

      When was the last time you observed this behaviour?

      Running the following snippet for 10,000 thread creation/death cycles, the memory bobs up and down (under win32 thread local memory gets returned to the system). The memory usage fluctuates between 7.4MB and 11.3 MB and ends at 7.1 MB just before it ends.

      So there is no obvious pattern of growth (and no big issue I can see), despite the closures, under 5.8.6 or 5.8.8. I haven't tried it under 5.9.x yet.

      Could you post code, or modify that below, to demonstrate the problem?

      #! perl -slw
      use strict;
      use threads;
      use threads::shared;

      our $N ||= 1000;

      our $aClonedGlobal = 12345;
      my $aClonedLexical = 12345;
      our $aSharedGlobal :shared = 12345;
      my $aSharedLexical :shared = 12345;

      my $running :shared = 0;

      for ( 1 .. $N ) {
          async{
              { lock $running; ++$running }
              my $tid = threads->self->tid;
              my( $some, $thread, $local, $vars ) = (12345) x 4;
              require Carp;
              require IO::Socket;
              require Time::HiRes;
              print "$tid: $aClonedGlobal : $aClonedLexical : $aSharedGlobal : $aSharedLexical";
              { lock $running; --$running };
          }->detach;
      }

      sleep 1 while $running;

      A final point, changes in shared variable's values are not automatically seen between threads. Each thread needs to run a timer or loop to constantly check them.

      I don't understand what you mean by this? Shared vars are tied. Every time you reference one, the current value is retrieved from the master copy. How can "changes ... not [be] automatically seen between threads.", unless you reference them?

      Maybe I am missing something? Again, could you post an example to demonstrate what you mean by this, please?


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Well I ran your code on linux( upping the count to 5000), and it starts out at about 4.4% mem (I have a gig of ram). After 2 minutes, it was up to 12.4 % ram. That is roughly a doubling of mem every minute...... that's what I'm talking about.

        Also your code is quite simple, meaning it uses modules that "probably" are quite thread safe. I would next start trying to run something like WWW::Mechanize, or LWP::UserAgent in the threads, and see what they add to the garbage leftovers. Maybe Win32 does a better job with garbage collection than linux?

        I don't understand what you mean by this? Shared vars are tied. Every time you reference one, the current value is retrieved from the master copy.

        In a simple example you are right. I'm talking about (and I should have mentioned it) using GUIs with threads. GUIs will allow you to tie, but that won't work across threads. Here is a simple example; the count is not automatically reflected to the thread.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use threads;
        use threads::shared;
        use Tk;

        my $count:shared=0;

        my $thread = threads->new( \&launch_thread )->detach;

        my $mw = MainWindow->new();
        $mw->geometry("600x400+100+100");
        my $l1 = $mw->Label(-textvariable => \$count)->pack();

        MainLoop;

        sub launch_thread {
            while (1){
                $count++;
                print "$count\n";
                sleep 1;
            }
        }

        I'm not very good with a simple Tie, but the following doesn't trigger the fetch or store in a non-GUI script.

        #!/usr/bin/perl
        use strict;
        use warnings;
        use threads;
        use threads::shared;
        use Tie::Watch;

        my $count:shared=0;

        my $thread = threads->new( \&launch_thread )->detach;

        my $watch = Tie::Watch->new(
            -variable => \$count,
            # -debug => 1,
            -fetch   => \&fetch,
            -store   => \&store,
            -destroy => sub {print "Final value=$count.\n"},
        );

        while(1){}

        sub launch_thread {
            while (1){
                $count++;
                print "$count\n";
                sleep 1;
            }
        }

        sub store {
            print "changed\n";
        }

        sub fetch{
            print "fetched @_\n";
        }

        Maybe you can make the simple Tie report a store?


        I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: Threads question
by shmem (Chancellor) on Jul 01, 2007 at 11:01 UTC
    So I decide to give up fork and choose threads in my next programme.

    That sounds like you would never ever use fork again, which feels wrong.

    Use the right tools for the job - it's almost always fork (on *nix), and sometimes threads, or almost always threads (on Windows) and rarely fork there .-)

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}