perluser09 has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I have written a short Perl script that creates two threads using async. There is a shared array, @array. I expect each thread to lock @array, pop 10 elements from it, push them onto @my_oids, release the lock, and then print each element of @my_oids. Both threads keep doing this as long as there are elements left in @array. I wrote the script on Windows and run it from the cmd shell. The output shows that the script appears to run fine and that both threads do part of the work.
    C:\Users\perluser09\Desktop\Perl>perl lock.pl
    ...
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.47
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.46
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.45
    Thread 2: Doing SNMP GET for 2.1.1.1:1.3.1.8.40
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.44
    Thread 2: Doing SNMP GET for 2.1.1.1:1.3.1.8.39
    Thread 2: Doing SNMP GET for 2.1.1.1:1.3.1.8.38
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.43
    .....
    Perl exited with active threads
            0 running and unjoined
            2 finished and unjoined
            0 running and detached
But, when I try to redirect the output of the script to a file and run the perl script as below, Thread 1 seems to do all the jobs and Thread 2 is missing completely.
    C:\Users\perluser09\Desktop\Perl>perl lock.pl > threadinfo.txt
    ....
    Thread 1: size of the array is 199
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.200
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.199
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.198
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.197
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.196
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.195
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.194
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.193
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.192
    Thread 1: Doing SNMP GET for 2.1.1.1:1.3.1.8.191
    ......
    Perl exited with active threads:
            0 running and unjoined
            2 finished and unjoined
            0 running and detached
What I discovered is that without redirection the two threads are created almost one after the other, but with redirection the second thread is created only after the first thread has finished its while loop, that is, after it has already done all the work. I would be grateful if someone could point out the reason for this behavior. The full code for the script is:
    #create the array with the ip:OID
    use warnings;
    use strict;
    use Thread qw(async);
    use threads::shared;

    share (my @array);
    @array = qw/2.1.1.1:1.3.1.8.1 2.1.1.1:1.3.1.8.2 2.1.1.1:1.3.1.8.3
                2.1.1.1:1.3.1.8.4 2.1.1.1:1.3.1.8.5 2.1.1.1:1.3.1.8.6
                2.1.1.1:1.3.1.8.7....sequence continues upto....2.1.1.1:1.3.1.8.199
                2.1.1.1:1.3.1.8.200/;

    my $thr1 = async {
        while($#array > 0) {
            print "Thread 1: size of the array is $#array\n";
            my @my_oids;
            {
                lock (@array); # Block until we get access to $a
                for(my $i=0;$i<10;$i++) {
                    push (@my_oids, (pop @array));
                }
            }
            foreach(@my_oids) {
                print "Thread 1: Doing SNMP GET for $_\n" if (defined($_));
            }
        }
    };

    my $thr2 = async {
        while($#array > 0) {
            print "Thread 2: Size of the array is $#array\n";
            my @my_oids;
            {
                lock (@array); # Block until we get access to $a
                for(my $i=0;$i<10;$i++) {
                    push (@my_oids, (pop @array));
                }
            }
            foreach(@my_oids) {
                print "Thread 2: Doing SNMP GET for $_\n" if (defined($_));
            }
        }
    };

    $thr1->join;
    $thr2->join;

Replies are listed 'Best First'.
Re: Redirecting output in Windows cmd prevents second thread from doing anything
by SimonPratt (Friar) on Jan 15, 2016 at 14:33 UTC

    My immediate expectation is that when you redirect the output it buffers it, allowing thread 1 to complete the entire workload before thread 2 has had a chance to start (threads take a long time to start in Perl, due to their heavy nature). Yes, the output is also buffered when printing to STDOUT, but \n forces a buffer flush in this instance. You can very easily test this by chucking a sleep into your loop.

    Beyond this, I'm not quite sure why you're threading, however a slightly different method of doing the same thing (without having to worry about the dangers of share'ing and lock'ing data structures) would be to use Thread::Queue to pass what is currently being contained in your array.

    A solution using Thread::Queue might look something like this:

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;

    my $queue = Thread::Queue->new();
    my @threads = map {threads->new(\&worker)} 1..2;

    my @array = qw(2.1.1.1:1.3.1.8.1 ... 2.1.1.1:1.3.1.8.199);
    $queue->enqueue(@array); # Give work to threads after threads created
    $queue->end();           # Must do this, otherwise threads will never end
    $_->join for @threads;   # Join threads to avoid thread errors at end of script

    sub worker {
        my $tid = threads->tid();
        print "Starting thread $tid\n";
        # defined() so that a false item such as "0" does not end the loop early
        while (defined(my $oid = $queue->dequeue())) {
            print "Thread $tid: Doing SNMP GET for $oid\n";
        }
        print "Finishing thread $tid\n";
    }
    Note: code untested

      Thank you, SimonPratt. The code you have suggested is very elegant!

        You're welcome :-)

        Not sure if you've noticed, but the solution I provided is capable of having the number of threads dialled up and down very easily. In fact, with very minor modifications, you can even create a non-threaded script (be sure to remove use threads and use Thread::Queue when you do this). I strongly suggest that you do this for testing (make sure to run your full workload, rather than just a print statement) - It will let you know if threading is actually helpful for your workload and if so let you dial in the optimal number of threads pretty rapidly.
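
        As a rough illustration of dialling the worker count, here is a minimal, untested-in-production sketch. The variable $n_threads, the counting worker, and the synthetic OID list are assumptions made for the example. Unlike the suggestion above, it keeps use threads loaded and simply runs the same worker inline when $n_threads is 0. It also assumes a Thread::Queue version that provides end().

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Time::HiRes qw(time);

# Assumption: set to 0 to run the same worker inline, without creating threads.
my $n_threads = 2;

# Assumption: 200 synthetic "ip:OID" strings standing in for the real list.
my @oids = map { "2.1.1.1:1.3.1.8.$_" } 1 .. 200;

my $queue = Thread::Queue->new();
$queue->enqueue(@oids);
$queue->end();    # no more work will be added; dequeue() returns undef once empty

# Hypothetical worker: the real SNMP GET would replace the counter increment.
sub worker {
    my $done = 0;
    while (defined(my $oid = $queue->dequeue())) {
        $done++;
    }
    return $done;
}

my $start = time();
my @counts;
if ($n_threads > 0) {
    my @workers = map { threads->create(\&worker) } 1 .. $n_threads;
    @counts = map { $_->join } @workers;
}
else {
    @counts = (worker());
}

printf "%d worker(s), %d items, %.3f s\n",
    scalar(@counts), scalar(@oids), time() - $start;
```

        Running it with a few different values of $n_threads, and with the real per-item work in place of the counter, gives a first impression of whether extra threads pay for themselves.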

Re: Redirecting output in Windows cmd prevents second thread from doing anything
by BrowserUk (Patriarch) on Jan 15, 2016 at 19:50 UTC

    You are misinterpreting what you are seeing. (As is everyone else who has replied.)

    The "problem" is simply that given the size of your data, thread 1 has completed processing the entire array before thread 2 gets a chance to do anything.

    You can prove this by increasing the number of elements in the array -- around 4000 should suffice; I simply duplicated the 1..200 list 20 times -- then you'll see that whilst the first couple of thousand are all processed by thread 1, after that thread 2 gets a look-in and they start alternating.

    You'll probably also notice that the logging starts getting mixed together -- partial lines from one thread being tacked onto the end of partial lines from the other -- but that's a different problem.
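
    The experiment described above could be sketched roughly as follows. This is an assumption-laden reconstruction, not the code that was actually run: the OID strings are synthetic, the worker sub is hypothetical, and it counts items per thread instead of printing each one so the split is easy to read.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Assumption: synthetic "ip:OID" strings; the 1..200 list repeated 20 times.
my @ids = (1 .. 200) x 20;
my @array :shared;
@array = map { "2.1.1.1:1.3.1.8.$_" } @ids;
my $total = @array;

# Hypothetical worker: pops batches of up to 10 under the lock, as in the
# original script, but counts the items instead of printing them.
sub worker {
    my $count = 0;
    while (1) {
        my @my_oids;
        {
            lock(@array);    # released at the end of this block
            my $n = @array < 10 ? scalar(@array) : 10;
            push @my_oids, pop(@array) for 1 .. $n;
        }
        last unless @my_oids;    # nothing left to take
        $count += @my_oids;
    }
    return $count;
}

my @workers = map { threads->create(\&worker) } 1 .. 2;
my @counts  = map { $_->join } @workers;

printf "Thread %d processed %d of %d items\n", $_ + 1, $counts[$_], $total
    for 0 .. $#counts;
```

    How the items are divided depends on the machine and on how much work each item involves, so the exact split will vary from run to run.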


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :)
    In the absence of evidence, opinion is indistinguishable from prejudice.

      " (As is everyone else who has replied.) "

      Even this monk in this post when he wrote:

      ... allowing thread 1 to complete the entire workload before thread 2 has had a chance to start ...
      ??


      The way forward always starts with a minimal test.

        If you are selective with your quoting you can suggest anything.

        The full, relevant quote is:

        My immediate expectation is that when you redirect the output it buffers it, allowing thread 1 to complete the entire workload before thread 2 has had a chance to start

        Which I'll break down into 3 parts in reverse order:

        1. allowing thread 1 to complete the entire workload before thread 2 has had a chance to start

          No mention that the small size of the sample dataset gives only the appearance of the identified "problem".

          Nor that if the dataset was large enough to actually warrant using two threads, then there is no problem.

        2. when you redirect the output it buffers it

          The symptoms, wrongly identified as a "problem", have exactly nothing to do with buffering. Turn buffering off and the symptoms do not change at all.

        3. My immediate expectation is

          Giving responses based upon expectations, when it took about 2 minutes to completely disprove the theory, is willfully misleading.

        And the "solution" offered to the (non)"problem" is just completely unnecessary.

        So, yes!


Re: Redirecting output in Windows cmd prevents second thread from doing anything
by dasgar (Priest) on Jan 15, 2016 at 19:23 UTC

    First, just wanted to point out that the documentation for the Thread module indicates that it is deprecated and recommends that users use the threads module instead (see DEPRECATED section). That has nothing to do with your question, but might help you to avoid other issues down the road.

    For creating a log of everything from all of the threads, you could create a variable in each thread that would contain the log contents of that thread and have the thread return that value. Then the main code section can print that out to a file afterwards.

    Here's an untested modification of your code that does what I just described.
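
    A minimal sketch of the return-the-log idea, not the poster's own listing, might look like this. It assumes the threads module rather than Thread, a synthetic OID list, and a hypothetical worker sub that takes a display name.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Assumption: 200 synthetic "ip:OID" strings standing in for the original list.
my @array :shared;
@array = map { "2.1.1.1:1.3.1.8.$_" } 1 .. 200;

# Hypothetical worker: collects its log lines in a private array and returns
# them as one string instead of printing while it runs.
sub worker {
    my ($name) = @_;
    my @log;
    while (1) {
        my @my_oids;
        {
            lock(@array);
            my $n = @array < 10 ? scalar(@array) : 10;
            push @my_oids, pop(@array) for 1 .. $n;
        }
        last unless @my_oids;
        push @log, "$name: Doing SNMP GET for $_\n" for @my_oids;
    }
    return join '', @log;
}

my @threads = map { threads->create(\&worker, "Thread $_") } 1 .. 2;
my @logs    = map { $_->join } @threads;    # one log string per thread

# Only the main thread prints, after every worker has been joined.
print @logs;
```

    Because all printing happens after the joins, redirecting STDOUT to a file gives one contiguous block per thread rather than interleaved lines.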

    Of course, one drawback of that approach is that if a thread were to die prematurely, you would lose all logging for that thread. Since you are already using a shared variable for the threads to read from, you could also use one or more shared variables that the threads could use for logging. And again, write that out to file after joining all of the threads.
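
    The shared-variable variant could be sketched as below. The names @log and log_line, and the synthetic OID list, are assumptions for illustration only.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Assumption: 200 synthetic "ip:OID" strings standing in for the original list.
my @array :shared;
@array = map { "2.1.1.1:1.3.1.8.$_" } 1 .. 200;

# Shared log: entries pushed here remain available to the main thread even if
# the thread that wrote them later dies.
my @log :shared;

# Hypothetical helper: append one line to the shared log under its own lock.
sub log_line {
    my ($line) = @_;
    lock(@log);
    push @log, $line;
    return;
}

sub worker {
    my ($name) = @_;
    while (1) {
        my @my_oids;
        {
            lock(@array);
            my $n = @array < 10 ? scalar(@array) : 10;
            push @my_oids, pop(@array) for 1 .. $n;
        }
        last unless @my_oids;
        log_line("$name: Doing SNMP GET for $_\n") for @my_oids;
    }
    return;
}

# Create both threads first, then join them.
$_->join for map { threads->create(\&worker, "Thread $_") } 1 .. 2;

print @log;
```

    The entries appear in the order the threads pushed them, so this log is already roughly chronological, at the cost of taking a second lock for every line.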

    And if you are needing/wanting the logging to be intermixed, just add a timestamp for each log entry from each thread. Then you could sort the entries from all of the logs afterwards to create a single chronologically ordered log from the contents of all of the thread logs.
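
    The timestamp-and-sort step might be sketched like this. No threads are created here; the two arrays are stand-ins for logs that two threads would have returned, and stamp and merge_logs are hypothetical helper names. Time::HiRes is used on the assumption that whole-second timestamps would be too coarse.

```perl
use strict;
use warnings;
use Time::HiRes qw(time);

# Hypothetical helper a thread would call for each entry: prefixes the message
# with a high-resolution epoch timestamp and a tab.
sub stamp {
    my ($msg) = @_;
    return sprintf("%.6f\t%s", time(), $msg);
}

# Merge any number of per-thread logs (array refs of stamped lines) into one
# list ordered by timestamp.
sub merge_logs {
    my @logs = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ (split /\t/, $_, 2)[0], $_ ] }
           map  { @$_ } @logs;
}

# Stand-in for two threads' logs, filled alternately so the stamps interleave.
my (@log1, @log2);
for my $i (1 .. 3) {
    push @log1, stamp("Thread 1: entry $i");
    push @log2, stamp("Thread 2: entry $i");
}

my @merged = merge_logs(\@log1, \@log2);
print "$_\n" for @merged;
```

    Entries with identical timestamps may come out in either order, so a per-thread sequence number would be needed if exact ordering within a thread matters.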