neversaint has asked for the wisdom of the Perl Monks concerning the following question:

Dear Masters,

I have code that uses a hash and an array generated up front by some functions. In the unthreaded version, my code looks like this:

    use strict;
    use warnings;

    my @array1 = funct_array($par1,$par2,$filename);
    my %hash1  = funct_hash($par1,$par2,$par3,$filename);

    # there are more tables/lists like this

    # then followed process that
    # takes the pre-generated value from @array1 and %hash1.
    # Thus both @array1 and %hash1 must be completed first.

    # Subroutine list
    sub funct_array {
        my ($p1,$p2,$fn) = @_;
        my @array_result;
        # Run some time consuming process
        return @array_resutlt;
    }

    sub funct_hash{
        my ($p1,$p2,$fn) = @_;
        my %hash_result;
        # Run some time consuming process
        return %hash_resutlt;
    }

Then, to speed up the process, I tried using threads. Here is what I did:

    use strict;
    use warnings;
    use threads;
    use Data::Dumper;

    my $array1 = threads->new(\&funct_array,($par1,$par2,$filename));
    my $hash1 = threads->new(\&funct_hash,($par1,$par2,$par3,$filename));

    # Then I tried to see the content of the array
    # with Data Dumper
    print Dumper $array->join;
    print Dumper $hash1->join;

    # But I couldn't see any value of @array1 and
    # %hash1 that I can use for the later process

    # Subroutine list
    sub funct_array {
        my ($p1,$p2,$fn) = @_;
        my @array_result;
        # Run some time consuming process
        return @array_resutlt;
    }

    sub funct_hash{
        my ($p1,$p2,$fn) = @_;
        my %hash_result;
        # Run some time consuming process
        return \%hash_resutlt;
    }

As I noted in the snippet above, I couldn't recover the returned array and hash after threading. What's wrong with my code?

---
neversaint and everlastingly indebted.......

Re: Howto capture array/hash back after threaded process
by BrowserUk (Patriarch) on Mar 02, 2006 at 12:16 UTC

    Here's one, fairly simple way to do what you've requested.

        use strict;
        use warnings;
        use threads;

        sub getArray {
            my( $arg1, $arg2, $fn ) = @_;
            my @array = 'a' .. 'z';
            return @array;
        }

        sub getHash {
            my( $arg1, $arg2, $arg3, $fn ) = @_;
            my( $key, $value ) = ( 'A', 1 );
            return map{ $key++ => $value++ } 1 .. 10;
        }

        my( $arg1, $arg2, $arg3, $filename ) = @ARGV;

        ### Note: The function that is run in the thread
        ### is called in the same context as the call to create.
        ### If you want the function called in a list context,
        ### you must call create in a list context.
        my( $thrArray ) = threads->create( \&getArray, $arg1, $arg2, $filename );
        my( $thrHash )  = threads->create( \&getHash, $arg1, $arg2, $arg3, $filename );

        ### Do anything else the main thread needs to do here, *before* calling join.
        ### The joins will block until their threads complete.

        ### The values returned by the functions are returned when you join the thread.
        my @array = $thrArray->join;
        my %hash  = $thrHash->join;

        print "@array\n";
        print map( "$_:$hash{ $_ }, ", sort keys %hash ), $/;

        __END__
        C:\test>533816 One Two Three apath\to\afile
        a b c d e f g h i j k l m n o p q r s t u v w x y z
        A:1, B:2, C:3, D:4, E:5, F:6, G:7, H:8, I:9, J:10,

    This is the very simplest use of threads that avoids the need for shared data completely. Unless your subroutines do IO, or you are using a multi-cpu machine, this will not speed up your program.

    On a single cpu machine, if they both do IO (to or from different files!), it may speed things up somewhat, depending upon the performance of your filesystem and hardware and the proportion of time the code spends waiting for IO relative to running cpu-intensive code, but it may not.

    On a multi-cpu machine, it should run more quickly.
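To make the note about calling context in the code comments above concrete, here is a minimal sketch. It assumes a perl built with ithreads; `report_context` is a made-up helper, not part of the original code. It only shows that the thread's function sees the same context that `threads->create` was called in.

```perl
use strict;
use warnings;
use threads;

# Hypothetical helper: reports the context it was called in.
sub report_context {
    return wantarray ? ( 'list', 'context' ) : 'scalar context';
}

# Parentheses around the variable put create() in list context,
# so the thread's function also runs in list context.
my ($thr_list) = threads->create( \&report_context );

# Without the parentheses, create() and the function run in scalar context.
my $thr_scalar = threads->create( \&report_context );

my @from_list   = $thr_list->join;
my $from_scalar = $thr_scalar->join;

print "list thread returned:   @from_list\n";
print "scalar thread returned: $from_scalar\n";
```

This is why a thread created with a plain `my $thr = ...` hands back only a single value at join time, even if the function returns a whole array.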


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      On a multi-cpu machine, it should run more quickly.
      Dear BrowserUk,
      Thanks so much for the reply. I managed to get it working. I have a few questions, however, to clarify my understanding of your statement above.
      • Is the number of CPUs we have the one shown by this command?
        $ cat /proc/cpuinfo
      • What happens if the number of processes to be threaded is greater than the number of CPUs we have? Will they be evenly distributed? Can we still benefit from the speedup?


        Is the number of CPUs we have the one shown by this command? $ cat /proc/cpuinfo

        I don't use *nix, so you'll need to get a response from one of the many here who do.
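For readers on Linux, one rough way to get that number from Perl is to count the `processor` entries in `/proc/cpuinfo`. This is only a sketch under that assumption: `count_cpus` is a hypothetical helper, the file layout can differ between architectures, and the count is of logical processors as the kernel reports them.

```perl
use strict;
use warnings;

# Hypothetical helper: counts "processor" entries in a cpuinfo-style file.
# Returns undef if the file cannot be opened (e.g. on non-Linux systems).
sub count_cpus {
    my ($path) = @_;
    $path = '/proc/cpuinfo' unless defined $path;
    open my $fh, '<', $path or return;
    # grep in scalar context yields the number of matching lines
    my $count = grep { /^processor\s*:/ } <$fh>;
    close $fh;
    return $count;
}

my $cpus = count_cpus();
print defined $cpus
    ? "Logical CPUs: $cpus\n"
    : "Could not read /proc/cpuinfo\n";
```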

        What happens if the number of processes to be threaded is greater than the number of CPUs we have?

        If you have (say) 2 cpus and 3 cpu-bound functions running in threads, then they will complete more quickly than running the same 3 functions one after the other as a single-threaded process.

        It may be quicker overall to run only two threads at the same time to avoid excessive task switching, but that is a fine-tuning thing best left for when you see what performance you achieve doing the simple thing.
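If that fine tuning does turn out to be worthwhile, one simple way to cap the number of threads running at once is to start them in batches. The sketch below assumes a perl built with ithreads; `busy_sum`, the limit of 2 and the workloads are all made up for illustration.

```perl
use strict;
use warnings;
use threads;

# Hypothetical stand-in for a CPU-bound function.
sub busy_sum {
    my ($n) = @_;
    my $sum = 0;
    $sum += $_ for 1 .. $n;
    return $sum;
}

my $max_threads = 2;                      # assumed number of CPUs
my @jobs        = ( 1000, 2000, 3000 );   # made-up workloads
my @results;

# Start at most $max_threads threads, wait for that batch, then start the next.
while ( my @batch = splice @jobs, 0, $max_threads ) {
    my @running;
    for my $n (@batch) {
        my $thr = threads->create( \&busy_sum, $n );   # scalar context
        push @running, $thr;
    }
    for my $thr (@running) {
        my $sum = $thr->join;    # blocks until this thread is done
        push @results, $sum;
    }
}

print "@results\n";
```

Batching is the crudest form of throttling: a slow job holds up the start of the next batch. Whether that matters depends on how uneven the jobs are.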

        Will they be evenly distributed?

        That is actually quite hard to answer accurately without knowing what the processes are doing; what else is running on the system; etc., but in most cases the answer will be: Yes.

        Can we still benefit from the speedup?

        Again, if you describe what these threads will be doing you will get a more accurate and definitive answer, but under most OSs and circumstances: Yes.


Re: Howto capture array/hash back after threaded process
by zer (Deacon) on Mar 02, 2006 at 08:26 UTC
    maybe this will work

    push @array1, new threads \&funct_array,($par1,$par2,$filename);
    If all else fails, there is threads::shared, which gives you a cheater's way of storing values.
      Hi zer, thanks for the reply, but it doesn't work. I've also tried this variation:

          my @array1;
          push @array1,threads->new(\&funct_array,($par1,$par2,$filename));
          print Dumper \@array1;

      That doesn't work either. Can you give an example of how to use threads::shared? I've read the documentation but couldn't figure out how to apply it to my subroutines.
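A minimal sketch of the threads::shared approach asked about here, assuming a perl built with ithreads. The worker names `fill_array` and `fill_hash` and the values they store are invented for illustration. The workers write into shared containers instead of returning their results.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Shared containers are visible to every thread.
my @shared_array :shared;
my %shared_hash  :shared;

# Hypothetical workers that fill the shared variables instead of returning.
sub fill_array {
    my ($count) = @_;
    lock(@shared_array);                 # released when the sub returns
    push @shared_array, 1 .. $count;
    return;
}

sub fill_hash {
    my ($count) = @_;
    lock(%shared_hash);
    $shared_hash{"key$_"} = $_ * 10 for 1 .. $count;
    return;
}

my $t1 = threads->create( \&fill_array, 5 );
my $t2 = threads->create( \&fill_hash,  3 );

# join still matters: it waits until each worker has finished writing.
$_->join for $t1, $t2;

print "array: @shared_array\n";
print "hash:  ",
    join( ', ', map { "$_=$shared_hash{$_}" } sort keys %shared_hash ), "\n";
```

Shared arrays and hashes can only hold plain scalars or references to other shared data, so for simple results it is often easier to return the values through join.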

Re: Howto capture array/hash back after threaded process
by zer (Deacon) on Mar 02, 2006 at 09:27 UTC
    what is the error you get from the original code?
      Dear zer,
      With the example from my second post:

          $VAR1 = [
                    bless( do{\(my $o = '168888848')}, 'threads' )
                  ];
          A thread exited while 2 threads were running.


            #!/usr/bin/perl -w
            BEGIN{
                use Config;
                die "Threadbare\n" unless $Config{'useithreads'};
            }
            use strict;
            use warnings;
            use threads;
            use Data::Dumper;

            sub funct_array {
                my $self = threads->self;
                my ($p1,$p2,$fn) = @_;
                my @array_result="12345";
                # Run some time consuming process
                return @array_result;
            }

            sub funct_hash{
                my $self = threads->self;
                my ($p1,$p2,$fn) = @_;
                my %hash_result;
                # Run some time consuming process
                return \%hash_result;
            }

            my ($par1,$par2,$par3,$filename);
            $par1=1;
            $par2=2;
            $par3=4;
            $filename=3;

            my $array1 = new threads (\&funct_array,($par1,$par2,$par3,$filename));
            my $hash1 = threads->new(\&funct_hash,($par1,$par2,$par3,$filename));

            print "Sleep";
            sleep 1;
            print "\nwakeup $$array1 $$hash1";

            # Then I tried to see the content of the array
            # with Data Dumper
            print Dumper $array1->join;
            print Dumper $hash1->join;

        This returns the references to the functions. Your functions don't return any value. This version gives the Dumper module something to show. It seems to be on the right track for now; I'll see what else I can find. There is something involving waitpid.

        That means the thread you are calling hasn't finished processing.

        The join sub is the waitpid for threads: it lets you know when the thread has finished processing.

Re: Howto capture array/hash back after threaded process
by zentara (Cardinal) on Mar 02, 2006 at 11:40 UTC
    Just as a general-purpose observation, using sleep 1 before joining the threads could be a problem if your thread takes longer than a second to complete. What I would do is make a shared variable, initially set to 0, which the thread sets to 1 when done (at the end of its code block). Then have a loop in main that monitors that shared variable and joins the thread once it detects that it is 1.
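The flag idea described above can be sketched roughly as follows, assuming a perl built with ithreads. The `worker` sub, the 50 ms polling interval and the doubled numbers are placeholders, not anything from the original code.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $done :shared = 0;    # set to 1 by the worker when it finishes

# Hypothetical worker: does its work, then raises the flag.
sub worker {
    my @result = map { $_ * 2 } 1 .. 5;    # stand-in for the slow part
    { lock($done); $done = 1; }
    return @result;
}

my ($thr) = threads->create( \&worker );   # list context, so join returns a list

# The main thread polls the flag; other work could go inside this loop.
until ($done) {
    select( undef, undef, undef, 0.05 );   # sleep 50 ms instead of spinning
}

my @result = $thr->join;
print "@result\n";
```

Note that join already blocks until the thread is done, so the flag is mainly useful when the main thread has other work to do in the meantime.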

    I'm not really a human, but I play one on earth. flash japh
Re: Howto capture array/hash back after threaded process
by Anonymous Monk on Mar 02, 2006 at 09:09 UTC
    It doesn't compile
        Global symbol "$par1" requires explicit package name at
        Global symbol "$par2" requires explicit package name at
        Global symbol "$filename" requires explicit package name
        Global symbol "$par1" requires explicit package name at
        Global symbol "$par2" requires explicit package name at
        Global symbol "$par3" requires explicit package name at
        Global symbol "$filename" requires explicit package name
        Global symbol "$array" requires explicit package name at
        Global symbol "@array_resutlt" requires explicit package
        Global symbol "%hash_resutlt" requires explicit package