PerlMonks  

How to make a thread to wait till a command on shell get completed.

by techman2006 (Beadle)
on Dec 02, 2013 at 06:53 UTC ( [id://1065232]=perlquestion )

techman2006 has asked for the wisdom of the Perl Monks concerning the following question:

I need to make a thread wait until a command run in the shell has completed.

The use case is as follows:

  1. Pick one archive from the queue.
  2. Untar that archive into a directory. This operation is run using the backticks operator.
  3. Repeat the above steps until the queue is empty.

These operations are performed by a set of threads. The problem is that the threads don't wait until the shell command has completed, so they overload the system.

Is there a way to make a thread wait until the shell command has completed?

Re: How to make a thread to wait till a command on shell get completed.
by BrowserUk (Patriarch) on Dec 02, 2013 at 07:02 UTC

    If you run a command (shell or otherwise) using system, the program will block until system returns.

    If the program calling system is a single threaded program, that single thread will block.

    If the program is a multi-threaded program, whichever thread calls system will block.
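A minimal sketch of the point above, assuming a perl built with thread support. The helper name `timed_system` is hypothetical, and a child perl that sleeps stands in for a slow shell command. Each thread measures how long its own `system` call took:

```perl
use strict;
use warnings;
use threads;
use Time::HiRes qw(time);

# Run a child process with system() and report how long the call took.
# $^X is the running perl, used here as a portable stand-in for a slow
# shell command (an assumption for illustration, not the OP's command).
sub timed_system {
    my ($seconds) = @_;
    my $start = time;
    system( $^X, '-e', "select(undef, undef, undef, $seconds)" );
    return time - $start;
}

# Three threads, each blocking inside its own system() call.
my @threads = map { threads->create( \&timed_system, 0.5 ) } 1 .. 3;
my @elapsed = map { $_->join } @threads;

printf "thread %d blocked for %.2fs\n", $_ + 1, $elapsed[$_] for 0 .. $#elapsed;
```

Every thread should report roughly the child's sleep time, which is what blocking means here: the calling thread does not continue until its child has exited, while the other threads run their own children in parallel.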

    Now what is your problem?


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Basically, I am using the backtick operator, and multiple instances of gzip get started because the thread is not waiting for the shell to return.

      So how can I make the thread block when using backticks?

        Back ticks also wait for the command to finish. Do you use &? Show some code!
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
        Basically, I am using the backtick operator, and multiple instances of gzip get started because the thread is not waiting for the shell to return.

        Backticks also block. (Unless you are deliberately backgrounding the command, in which case, don't do that!)

        Post your failing code and you'll get help.



        Can you share example code that recreates the issue you're seeing? Also, do you actually need to use backticks?

        In the meantime, without seeing your code, I can think of two suggestions. First, use the system function instead of backticks, as BrowserUk suggested. Second, try a module such as Archive::Tar to extract your tar files instead of shelling out.
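A short sketch of the Archive::Tar suggestion, assuming the module is installed (it ships with recent perls). The file names and contents are made up for illustration. The archive is built and read back in-process, so there is no child process to wait for:

```perl
use strict;
use warnings;
use Archive::Tar;
use File::Temp qw(tempdir);
use File::Spec;

my $dir     = tempdir( CLEANUP => 1 );
my $archive = File::Spec->catfile( $dir, 'example.tar' );

# Build an archive in-process. add_data() stores a string under a name,
# so no input files are needed for this demonstration.
my $tar = Archive::Tar->new;
$tar->add_data( 'Test1/a.txt', "first file\n" );
$tar->add_data( 'Test1/b.txt', "second file\n" );
$tar->write($archive) or die "Cannot write $archive: " . $tar->error;

# Read it back; the constructor returns only after the archive is loaded.
my $reader = Archive::Tar->new($archive)
    or die "Cannot read $archive: $Archive::Tar::error";
my @names = $reader->list_files;
@names = sort @names;
print "$_\n" for @names;
```

Archive::Tar holds the archive in memory, so for very large archives the external tar command may still be the better fit.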

Re: How to make a thread to wait till a command on shell get completed.
by Anonymous Monk on Dec 02, 2013 at 11:34 UTC
    Assuming your code is something like this:
    sub thr {
        my $file = shift;
        system(qw/tar xf/, $file);
    }
    for my $file (@files) {
        launch_thread(\&thr, $file);
    }
    you can solve the problem by wrapping the system call inside a mutex or a semaphore. I'm afraid I don't know how to do that -- someone who knows Perl threads will have to tell you how.
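One way the semaphore idea might look, as a sketch rather than a definitive answer. It assumes the core Thread::Semaphore module. The limit of 2, the helper name `run_limited`, and the sleeping child perl standing in for `tar xf` are all illustrative. The shared counters exist only to show that the limit holds:

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

my $max_parallel = 2;    # assumed limit, for illustration only
my $slots        = Thread::Semaphore->new($max_parallel);

my $running : shared = 0;    # commands currently executing
my $peak    : shared = 0;    # highest value $running ever reached

sub run_limited {
    my ($id) = @_;
    $slots->down;            # wait until a slot is free
    {
        lock($running);      # $peak is only touched under this lock
        $running++;
        $peak = $running if $running > $peak;
    }

    # Stand-in for "tar xf $file": a child perl that sleeps briefly.
    system( $^X, '-e', 'select(undef, undef, undef, 0.2)' );

    {
        lock($running);
        $running--;
    }
    $slots->up;              # release the slot
    return $id;
}

my @threads = map { threads->create( \&run_limited, $_ ) } 1 .. 6;
$_->join for @threads;

print "peak concurrent commands: $peak (limit $max_parallel)\n";
```

Six threads are started, but at most two external commands run at once. Note that this caps the commands, not the threads; the idle threads still exist and consume memory while they wait.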

      Below is the code which I was talking about.

      use strict;
      use warnings;
      use File::Find;
      use Time::localtime;
      use File::Copy 'cp';
      use File::Copy 'mv';
      use File::Path qw(make_path);
      use threads;
      use Thread::Queue;

      # Package for debugging need to remove
      use diagnostics;
      use Data::Dumper;

      sub CreateIndividualArchive {
          my $srcDir        = shift;
          my $destDir       = shift;
          my $arrayFileList = shift;
          my $pathDelimiter = "/";

          #my $tempFile = "dump.txt";
          my @chars = ( '0' .. '9', 'A' .. 'F' );
          my $len   = 8;
          my $string;
          while ( $len-- ) { $string .= $chars[ rand @chars ] }
          my $tempFile = $string;

          #print "Temp fils is $tempFile\n";
          my $fh;
          my $fileName = $destDir . $pathDelimiter . $tempFile;
          open $fh, '>', $fileName or die "Cannot open $tempFile :$!";
          my $dirName = $arrayFileList->[0];
          chop($dirName);
          for ( @$arrayFileList[ 1 .. $#$arrayFileList ] ) {
              print $fh "$_\n";
          }
          close($fh);
          my $tarFileList = $destDir . $pathDelimiter . $tempFile;
          my $tarExt      = ".tar.gz";
          my $tarFileName = $destDir . $pathDelimiter . $dirName . $tarExt;
          my $cmd         = "tar -zcf $tarFileName -C $srcDir -T $tarFileList";

          #print "Cmd is $cmd\n";
          print "CMD = $cmd\n";
          my @tarOutput = `tar -zcf $tarFileName -C $srcDir -T $tarFileList 2>&1`;
          if ( $? == -1 ) {
              print "Archiving of the files fails : $!\n";
              unlink $tarFileName;
              return 0;
          }
          unlink $tarFileList;
          return 1;
      }

      sub Thread {
          my $hashParm        = shift;
          my %hashFileList    = %$hashParm;
          my $sourcePath      = shift;
          my $destinationPath = shift;
          my $keys;
          my $values;
          my @arrayValues;
          my @rowData       = ();
          my $totalKey      = keys %hashFileList;
          my $pathDelimiter = "/";

          #print "In thread and total keys received $totalKey\n";
          while ( ( $keys, $values ) = each(%hashFileList) ) {
              push( @arrayValues, $keys . $pathDelimiter );
              my @row = ( $keys . $pathDelimiter, $keys . $pathDelimiter, 0 );
              push( @rowData, \@row );
              my @arrayParm = @{$values};
              foreach my $value (@arrayParm) {
                  my $fileName = $$value[0];
                  my $fileSize = $$value[1];
                  push( @arrayValues, $fileName );
                  my @row = ( $keys . $pathDelimiter, $fileName, $fileSize );
                  push( @rowData, \@row );
              }

              #print "SourcePath $sourcePath and dest $destinationPath\n";
              my $error = CreateIndividualArchive( $sourcePath, $destinationPath, \@arrayValues );
              if ( $error eq 0 ) {
                  print "Error while doing tar is $error\n";
              }
              @arrayValues = ();
              @rowData     = ();
          }
      }

      sub ScanDirWithPattern {
          my $sourcePath    = shift;
          my $hashFileList  = shift;
          my $pathDelimiter = "/";
          my $pattern       = ".txt";
          if ( 0 eq opendir( DIR, $sourcePath ) ) {
              print "Failed to open directory $sourcePath\n";
              return 0;
          }
          my @dirList;
          if ( 0 eq ( @dirList = readdir(DIR) ) ) {
              print "Failed to read directory $sourcePath\n";
              closedir(DIR);
              return 0;
          }
          closedir(DIR);
          foreach my $dir (@dirList) {

              #print "Current directory is $dir\n";
              next if ( $dir eq "." or $dir eq ".." );
              my $currentDir = $sourcePath . $pathDelimiter . $dir;
              if ( -d $currentDir ) {
                  if ( 0 eq opendir( DIR, $currentDir ) ) {
                      print "Failed to open directory $currentDir\n";
                      return 0;
                  }
                  my @fileList;
                  if ( 0 eq ( @fileList = readdir(DIR) ) ) {
                      print "Failed to read directory $dir\n";
                      closedir(DIR);
                      return 0;
                  }
                  closedir(DIR);
                  my @relativeFileArray;
                  foreach my $file (@fileList) {
                      next if ( $file eq "." or $file eq ".." );
                      my $currentFile = $sourcePath . $pathDelimiter . $dir . $pathDelimiter . $file;
                      next if ( -d $currentFile );
                      if ( -f $currentFile ) {
                          if ( $currentFile =~ /$pattern/i ) {
                              my $relativeFile     = $dir . $pathDelimiter . $file;
                              my $size             = -s $currentFile;
                              my @currentFileArray = ( $relativeFile, $size );
                              print "Inserting the $relativeFile in array\n";
                              push( @relativeFileArray, \@currentFileArray );
                          }
                      }
                  }
                  $hashFileList->{$dir} = \@relativeFileArray;
              }
          }
          return 1;
      }

      sub Create {
          my ( $sourcePath, $destinationPath ) = @_;
          my $pathDelimiter = "/";
          my %hashFileList;
          my $folderName = "temp";
          my $error      = ScanDirWithPattern( $sourcePath, \%hashFileList );
          if ( $error eq 0 ) {
              print "Error while scaning $sourcePath for files\n";
              return 0;
          }
          my $keys;
          my $values;
          my @arrayValues;
          my @rowData    = ();
          my $totalKeys  = keys %hashFileList;
          my $numThreads = 5;    #For the time being
          if ( $totalKeys le $numThreads ) {
              $numThreads = $totalKeys;
          }
          my $bucketSize = $totalKeys / $numThreads;
          my @keys       = keys %hashFileList;
          my @arrThreads;
          my $i = 0;
          my @arrHash;
          my $tempDir = $destinationPath . $pathDelimiter . $folderName;
          make_path($tempDir);
          $destinationPath = $tempDir;

          while ( my @keys2 = splice @keys, 0, $bucketSize ) {
              my %hash1;
              @hash1{@keys2} = @hashFileList{@keys2};
              push @arrHash, \%hash1;
          }
          for my $href (@arrHash) {
              my $t = threads->create( \&Thread, \%$href, $sourcePath, $destinationPath );
              push( @arrThreads, $t );
          }
          foreach (@arrThreads) {
              my $num = $_->join;

              #print "done with $num\n";
          }
      }

      my $srcDir  = "";
      my $destDir = "";
      if ( @ARGV < 2 ) {
          die "$0 - Need source and destination directory\n"
            . "Usage: perl $0 src dest\n";
      }
      $srcDir  = shift;
      $destDir = shift;
      Create( $srcDir, $destDir );

      When I run the above code and check how many instances of tar or gzip are running, I get the output below.

      [root@localhost trunk]# ps -eaf | grep "tar -zcf"
      root  1962 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2444.tar.gz -C /root/tests -T /root/dump//temp/48DEB775
      root  1987 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1585.tar.gz -C /root/tests -T /root/dump//temp/77415208
      root  1994 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1106.tar.gz -C /root/tests -T /root/dump//temp/BFA8D1F4
      root  1998 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test636.tar.gz -C /root/tests -T /root/dump//temp/8BED4FA8
      root  2016 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test273.tar.gz -C /root/tests -T /root/dump//temp/C228C9E6
      root  2021 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2573.tar.gz -C /root/tests -T /root/dump//temp/044B2F61
      root  2149 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2563.tar.gz -C /root/tests -T /root/dump//temp/9657C48F
      root  2150 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1553.tar.gz -C /root/tests -T /root/dump//temp/71BE66D1
      root  2152 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1726.tar.gz -C /root/tests -T /root/dump//temp/1B2D081F
      root  2200 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test493.tar.gz -C /root/tests -T /root/dump//temp/8932236E
      root  2201 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2274.tar.gz -C /root/tests -T /root/dump//temp/F42D8053
      root  2206 25225  0 19:54 pts/1 00:00:00 grep tar -zcf
      [root@localhost trunk]# ps -eaf | grep "tar -zcf"
      root  1994 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1106.tar.gz -C /root/tests -T /root/dump//temp/BFA8D1F4
      root  1998 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test636.tar.gz -C /root/tests -T /root/dump//temp/8BED4FA8
      root  2021 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2573.tar.gz -C /root/tests -T /root/dump//temp/044B2F61
      root  2149 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2563.tar.gz -C /root/tests -T /root/dump//temp/9657C48F
      root  2150 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1553.tar.gz -C /root/tests -T /root/dump//temp/71BE66D1
      root  2152 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1726.tar.gz -C /root/tests -T /root/dump//temp/1B2D081F
      root  2200 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test493.tar.gz -C /root/tests -T /root/dump//temp/8932236E
      root  2201 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test2274.tar.gz -C /root/tests -T /root/dump//temp/F42D8053
      root  2300 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1508.tar.gz -C /root/tests -T /root/dump//temp/573093A4
      root  2301 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test431.tar.gz -C /root/tests -T /root/dump//temp/1A75C3EF
      root  2353 28983  0 19:54 pts/0 00:00:00 tar -zcf /root/dump//temp/Test1088.tar.gz -C /root/tests -T /root/dump//temp/02CA6015
      root  2368 25225  0 19:54 pts/1 00:00:00 grep tar -zcf
      [root@localhost trunk]# ps -eaf | grep "gzip"
      root  2208  2200  2 19:54 pts/0 00:00:01 gzip
      root  2209  2149  1 19:54 pts/0 00:00:00 gzip
      root  2210  2150  1 19:54 pts/0 00:00:01 gzip
      root  2302  2301  0 19:54 pts/0 00:00:00 gzip
      root  2303  2300  1 19:54 pts/0 00:00:00 gzip
      root  2371  2353  3 19:54 pts/0 00:00:01 gzip
      root  2384  2377  0 19:55 pts/0 00:00:00 gzip
      root  2387  2386  0 19:55 pts/0 00:00:00 gzip
      root  2499  2389  2 19:55 pts/0 00:00:00 gzip
      root  2581  2509  0 19:55 pts/0 00:00:00 gzip
      root  2663  2583  4 19:55 pts/0 00:00:00 gzip
      root  2691 25225  0 19:55 pts/1 00:00:00 grep gzip
      root  2700  2665  0 19:55 pts/0 00:00:00 gzip

      Since I have created only 5 threads, I expect to see only 5 instances of tar and, similarly, 5 instances of gzip. But that doesn't seem to be the case.

      Any thoughts on how to fix this issue?

      Please note that the actual data is not text files, so the archiving will take some time to complete.

        Since I have created only 5 threads, I expect to see only 5 instances of tar and, similarly, 5 instances of gzip. But that doesn't seem to be the case.

        You say that, and indeed you do start out with my $numThreads = 5; but then you immediately override that:

        if ( $totalKeys le $numThreads ) {
            $numThreads = $totalKeys;
        }

        And then, in the loop where you create your threads, you don't seem to use that variable at all:

        for my $href (@arrHash) {
            my $t = threads->create( \&Thread, \%$href, $sourcePath, $destinationPath );
            push( @arrThreads, $t );
        }

        Your problem has nothing to do with backticks not blocking; it is simply a programming error: you are starting lots and lots of threads without (it appears) any mechanism to control how many you start.
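To illustrate how the thread count can run away, here is a small sketch. It is one reading of the posted bucket logic, not a confirmed diagnosis of the OP's run. `le` is Perl's string comparison, so "12" le "5" is true, and splice truncates a fractional length. The helper names `buckets_as_posted` and `buckets_fixed` are hypothetical; each returns how many buckets, and therefore threads, the splitting loop would produce:

```perl
use strict;
use warnings;
use POSIX qw(ceil);

# Mirrors the posted logic: string comparison with "le" and a possibly
# fractional bucket size handed to splice (which truncates it).
sub buckets_as_posted {
    my ( $total_keys, $num_threads ) = @_;
    $num_threads = $total_keys if $total_keys le $num_threads;
    my $bucket_size = $total_keys / $num_threads;
    my @keys        = ( 1 .. $total_keys );
    my $buckets     = 0;
    while ( my @chunk = splice @keys, 0, $bucket_size ) { $buckets++ }
    return $buckets;
}

# Same idea with a numeric comparison and the bucket size rounded up,
# so there are never more buckets than requested threads.
sub buckets_fixed {
    my ( $total_keys, $num_threads ) = @_;
    return 0 unless $total_keys;
    $num_threads = $total_keys if $total_keys <= $num_threads;
    my $bucket_size = ceil( $total_keys / $num_threads );
    my @keys        = ( 1 .. $total_keys );
    my $buckets     = 0;
    while ( my @chunk = splice @keys, 0, $bucket_size ) { $buckets++ }
    return $buckets;
}

for my $n ( 7, 12, 3000 ) {
    printf "%4d keys: posted logic -> %4d threads, fixed -> %d threads\n",
        $n, buckets_as_posted( $n, 5 ), buckets_fixed( $n, 5 );
}
```

With a limit of 5, the posted logic yields one thread per key for all three inputs, while the numeric version stays at or below 5.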

        You're loading Thread::Queue but not using it. You can simplify your thread-creation logic considerably with it:

        sub Create {
            my ( $sourcePath, $destinationPath ) = @_;

            # ... create directories and stuff ...
            my $tq = Thread::Queue->new();
            for ( 1 .. 5 ) {
                push @arrThreads, threads->create(
                    sub {
                        # Each worker pulls items until the queue is ended.
                        while ( defined( my $item = $tq->dequeue() ) ) {
                            DoSomethingWithItem( $item, $sourcePath, $destinationPath );
                        }
                    }
                );
            }
            while ( my ( $k, $v ) = each(%hashFileList) ) {
                $tq->enqueue($v);
            }
            $tq->end();
            foreach (@arrThreads) {
                my $num = $_->join;
            }
        }

        The major change here is that it passes the files to the threads one item at a time. Your code divides them into n buckets and passes a whole bucket to each thread at creation time.

        Apart from being nicer to look at, this one-at-a-time (supervisor-worker) approach should ensure that the threads will finish quicker since they're given equal amounts of work.
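A self-contained variant of the supervisor-worker approach, as a sketch under stated assumptions. The pool size of 3, the item names, and the child perl standing in for the tar command are all made up. It also checks `$?` after the backticks, since `-1` only signals that the command could not be started, while a non-zero exit code lives in `$? >> 8`. `end()` needs Thread::Queue 3.01 or later:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $num_workers = 3;                     # assumed pool size
my $work        = Thread::Queue->new;    # items waiting to be processed
my $done        = Thread::Queue->new;    # "item=exit_code" result strings
my $perl        = $^X;                   # stand-in for the real command

sub worker {
    while ( defined( my $item = $work->dequeue ) ) {

        # Backticks return only after the child has exited.
        my $output = `"$perl" -e "print 6*7" 2>&1`;
        my $exit =
            $? == -1  ? 'not-started'
          : $? & 127  ? 'signal'
          :             $? >> 8;
        $done->enqueue("$item=$exit");
    }
    return;
}

my @workers = map { threads->create( \&worker ) } 1 .. $num_workers;
$work->enqueue("archive$_") for 1 .. 8;
$work->end;                              # workers exit once the queue drains
$_->join for @workers;

# All workers have finished, so the result queue can be read without blocking.
my %status;
while ( defined( my $result = $done->dequeue_nb ) ) {
    my ( $item, $exit ) = split /=/, $result;
    $status{$item} = $exit;
}
print "$_ exited with $status{$_}\n" for sort keys %status;
```

Because only three workers exist, at most three child processes run at any moment, regardless of how many items are queued.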
