capturing output of system call inside a thread

that_guy has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks. I've been working on a multithreaded script and have trouble figuring out loggin/printing portion of it.

Each of the worker threads is essentially independent and can run perl code as well as do system() calls. I was originally using Capture::Tiny module and it worked fine enough except when system calls are involved. Whenever system calls come into play, all output gets jumbled up and mixed up.

Can this logging approach be done? If not is there another way? If this cannot be done with threads, can this kind of parallel processing/logging be done somehow differently?

Main script

#!/usr/local/bin/perl

use strict;
use warnings;

use threads;
use Capture::Tiny qw/capture_merged/;
use IO::File;
use IPC::Run qw/run/;
use autodie qw/open close/;
use FindBin qw/$RealBin/;

my $thread1 = threads->create ( \&worker, 1, 'sub1.log' ); 
my $thread2 = threads->create ( \&worker, 2, 'sub2.log' ); 

my $need_to_continue = 1;
while ($need_to_continue) {
    print "MAIN THREAD IS PRINTING\n";
    if ( $thread1->is_running() || $thread2->is_running() ) {
        sleep 3;
    } else {
        $need_to_continue = 0;
    }

}

$thread1->join();
$thread2->join();

exit 0;

################
sub worker {
        my $num           = shift;
        my $log_file      = shift;

        open ( my $LFH,  '>', $log_file );

        capture_merged {
            inner_worker ( $num );
        } stdout=>$LFH,stderr=>$LFH; 
    
        close  $LFH;
};

sub inner_worker {
        
        my $num = shift;
        print STDOUT "INFO> Some info from sub $num\n";   
        print STDERR "WARN> Some warn from sub $num\n";
        system ("$RealBin/test_err.pl");
        print STDERR "INFO> Some info from sub $num\n";   
        print STDERR "WARN> Some warn from sub $num\n";

};
[download]

test_err.pl (simulating random program) called using system

#!/usr/local/bin/perl
print "non-error\n";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
print "non-error\n";
print "non-error\n";
print "non-error\n";
sleep(1);
print "non-error\n";
print "non-error\n";
print "non-error\n";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
sleep(1);
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
sleep(1);
print "non-error\n";
print "non-error\n";
print "non-error\n";
warn "-E- test error";
warn "-E- test error";
warn "-E- test error";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
print "non-error\n";
sleep(1);
print "non-error\n";
sleep(1);
[download]

Comment on capturing output of system call inside a thread Select or Download Code

Replies are listed 'Best First'.
Re: capturing output of system call inside a thread by BrowserUk (Patriarch) on Sep 03, 2015 at 17:54 UTC
It does not surprise me that mixing Capture::Tiny with threading doesn't work well. Indeed, it would surprise me if it did. The standard file handles are process-global entities and thus shared by all the threads in the process. If you want your threads to all write to them concurrently, then you would need to use locking to ensure that the output from different threads didn't become mixed up and corrupted. That's relatively trivial to do: `use threads::shared; my $sem : shared; sub tprint { lock $sem; print @_; } sub twarn { lock $sem; warn @_; }` [download] But attempting to add Capture::Tiny into the mix simply isn't going to work as it uses fork to perform its work, and is completely oblivious to threading. As for an alternative; I cannot work out from your question or your toy code what it is that you're actually trying to achieve. But, if you can describe what it is that you need to do; rather than how you are currently trying to do "it"; then I might be able to help. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :) In the absence of evidence, opinion is indistinguishable from prejudice. I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!	[reply] [d/l]
Re^2: capturing output of system call inside a thread by that_guy (Novice) on Sep 03, 2015 at 18:22 UTC
Thanks for the reply. I want to have multiple concurrently running "processes". These "processes" are independent of each other. Each "process" can call system, print to STDOUT and STDERR, but I want outputs (both STDOUT and STDERR) and any output from system() of each of these "processes" to be captured in separate files preferably with STDOUT and STDERR outputs arranged properly. This way I would have meaningful log files for each process. Does that make sense?	[reply]
Re^3: capturing output of system call inside a thread by BrowserUk (Patriarch) on Sep 03, 2015 at 19:06 UTC
I want outputs (both STDOUT and STDERR) and any output from system() of each of these "processes" to be captured in separate files preferably with STDOUT and STDERR outputs arranged properly. What do you mean by: "any output from system()"? system doesn't produce output. Programs run using system produce output; but you've already said you want to capture that. So what other output ("produced by system()") are you expecting? "with STDOUT and STDERR outputs arranged properly"? stdout is usually buffered; whereas stderr is usually unbuffered; which means that stuff written to stderr often appears on the console before stuff that was written to stdout chronologically earlier in the execution of the program. There is nothing that can be done from outside of the program to influence the that. If you have access to the sources of the external programs you are running; you can ensure that both streams are unbuffered; but that has nothing to do with threading, or the perl code that is invoking those programs. I think I can suggest something to achieve your goals; but I need you to answer those two questions first. With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday' Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". I knew I was on the right track :) In the absence of evidence, opinion is indistinguishable from prejudice. I'm with torvalds on this Agile (and TDD) debunked I told'em LLVM was the way to go. But did they listen!	[reply]
Re^4: capturing output of system call inside a thread by that_guy (Novice) on Sep 03, 2015 at 19:35 UTC
Re^5: capturing output of system call inside a thread by BrowserUk (Patriarch) on Sep 03, 2015 at 19:43 UTC
Some notes below your chosen depth have not been shown here
Re: capturing output of system call inside a thread by GotToBTru (Prior) on Sep 03, 2015 at 15:25 UTC
System() creates its own fork and does not return any output. That may be why you don't log anything. Dum Spiro Spero	[reply]
Re: capturing output of system call inside a thread by SimonPratt (Friar) on Sep 03, 2015 at 15:40 UTC
You're redirecting STDERR and STDOUT in a multi-threaded environment for logging? That's brave :) You could try `local`ising STDOUT and STDERR in the thread before calling inner_worker: `local STDOUT; local STDERR;` [download] EDIT: Also: `my $thread1 = threads->create ( \&worker, 1, 'sub1.log' ); my $thread2 = threads->create ( \&worker, 2, 'sub2.log' ); my $need_to_continue = 1; while ($need_to_continue) { print "MAIN THREAD IS PRINTING\n"; if ( $thread1->is_running() \|\| $thread2->is_running() ) { sleep 3; } else { $need_to_continue = 0; } } $thread1->join(); $thread2->join(); exit 0;` [download] can be re-written thusly: `my @threads = map {threads->new(\&worker, $_, "sub$_.log")} 1..2; $_->join for @threads; exit 0;` [download] `join` is a blocking process that waits for the thread to complete.	[reply] [d/l] [select]
Re^2: capturing output of system call inside a thread by that_guy (Novice) on Sep 03, 2015 at 15:52 UTC
Thanks for the reply. I'm fully aware that all for this can be rewritten much more concise. This doesn't solve my problem though. I'm also aware that join is blocking which is why I don't want to join the thread while other thread might be running thus providing the parallel execution of all the threads. I've tried local STDOUT and local STDERR call and it gives strange results as well. Looks like after using it, most of the output goes to the screen for whatever reason.	[reply]
Re^3: capturing output of system call inside a thread by SimonPratt (Friar) on Sep 07, 2015 at 11:43 UTC
I see that your original problem has been resolved, but will still comment on a misunderstanding you appear to have about threading. When I say that join is blocking, it blocks only the thread that is calling the join. Because threads are run in parallel and are essentially independent of each other, calling join in one thread has no effect on the processing of any other threads. In fact, because join is blocking, it is preferable for the parent thread to call join on at least one sub-thread as soon as possible, to minimise the resources the parent thread is consuming (and hence making those additional resources available for the sub threads). The main point about writing code more concisely is that it makes your program easier to understand and follow. Doing something in 40 lines that can be easily written and understood in 2 lines makes life a lot harder for anyone trying to understand what you're doing. (Do keep in mind the distinction between minimising code and obfuscating code).	[reply]
Re: capturing output of system call inside a thread by RichardK (Parson) on Sep 03, 2015 at 16:10 UTC
What output is jumbled up? Did you mean stdout & stderr ? If so, then the docs for Capture::Tiny say this :- `capture_merged ... Caution: STDOUT and STDERR output in the merged result are not guarant +eed to be properly ordered due to buffering.` [download]	[reply] [d/l]
Re^2: capturing output of system call inside a thread by that_guy (Novice) on Sep 03, 2015 at 16:14 UTC
Sorry, I wasn't being specific. Output that belongs in Main thread (e.g. supposed to be visible on the screen) ended up in one of the log files, output printed in one of the worker sub ended up on the screen, and output that was supposed to be in sub1.log ended up in sub2.log	[reply]