Parallel Search using Thread::Pool

shanu_040 has asked for the wisdom of the Perl Monks concerning the following question:

Re: Parallel Search using Thread::Pool by BrowserUk (Patriarch) on Mar 17, 2009 at 09:55 UTC
The simplest architecture is to create a Thread::Queue, then have each of your search modules run in a separate thread and enqueue its results as it gets them. Your main thread can then read them off the other end of that shared queue and display them.
Thread::Pool will not be useful to you, as it is meant to run many copies of the same routine concurrently, but your application calls for running a different subroutine in each of your threads.
Depending on whether your application is web, GUI or console based, you might also want to use a second queue or a shared scalar to pass new search terms to your threads.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error. "Science is about questioning the status quo. Questioning authority". In the absence of evidence, opinion is indistinguishable from prejudice. "Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."
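The architecture described above can be sketched roughly as follows. This is only a minimal illustration, not code from the thread: the `%searchers` table, the `search_all` routine and the `$on_result` callback are hypothetical stand-ins for the real per-site search modules and display code, and it assumes a perl built with thread support.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-ins for the per-site search modules (assumption:
# each one takes a search term and returns a printable result string).
my %searchers = (
    site_a => sub { my ($term) = @_; return "site_a: 2 hits for '$term'" },
    site_b => sub { my ($term) = @_; return "site_b: 0 hits for '$term'" },
    site_c => sub { my ($term) = @_; return "site_c: 5 hits for '$term'" },
);

# Run every searcher in its own thread and hand each result to
# $on_result as soon as it arrives, rather than after all have finished.
sub search_all {
    my ($term, $on_result) = @_;
    my $Qresults = Thread::Queue->new;

    my @workers = map {
        my $name = $_;
        threads->create(sub {
            # Trap errors so a failing source still reports and terminates.
            my $r = eval { $searchers{$name}->($term) };
            $Qresults->enqueue( defined $r ? $r : "$name: failed: $@" );
            # undef marks "this worker has finished".
            $Qresults->enqueue(undef);
        });
    } sort keys %searchers;

    my $running = @workers;    # number of workers still producing
    my @seen;
    while ($running) {
        my $result = $Qresults->dequeue;    # blocks until something arrives
        if (defined $result) {
            $on_result->($result);
            push @seen, $result;
        }
        else {
            $running--;
        }
    }

    $_->join for @workers;
    return \@seen;
}

$| = 1;    # unbuffered output, so each result appears immediately
my $results = search_all('perl', sub { print "$_[0]\n" });
```

The order in which results are printed depends on which thread finishes first, so it can differ from run to run. Counting one undef per worker lets the main thread know when every source has reported without waiting on any particular one.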
Re^2: Parallel Search using Thread::Pool by shanu_040 (Sexton) on Jun 01, 2009 at 09:34 UTC
Could you please help me develop this? I have tried Thread::Queue. It takes too much time to retrieve the results from a source, and I have to search more than 50 sources at a time. There could be more than 10 instances running concurrently. How can the main thread read them off and display them? The following is the code I am using:

    sub run_search {
        my ($self, $searches, $search_string, $site, $max_hits, $from_year, $to_year) = @_;
        my $Qwork    = new Thread::Queue;
        my $Qresults = new Thread::Queue;
        my $THREADS  = scalar(keys %$searches);
        my @return;
        foreach my $obj (values %$searches) {
            eval {
                $obj->from_year($from_year);
                $obj->to_year($to_year);
                $obj->parse_search();
            };
            if ($@) {
                print STDERR "problem2 $@\n";
            }
            $Qwork->enqueue($obj);
        }
        $Qwork->enqueue( (undef) x $THREADS );
        my @pool = map {
            threads->create( \&parallel_search, $Qwork, $Qresults, $max_hits, $self->nuc_code )
        } 1 .. $THREADS;
        for (1 .. $THREADS) {
            while ( my $result = $Qresults->dequeue ) {
                push(@return, $result);
            }
        }
        ## Clean up the threads
        $_->join for @pool;
        return(\@return);
    }

    #
    # the parallel server
    #
    sub parallel_search {
        my ($Qwork, $Qresults, $max_hits, $nuc_code) = @_;
        my $tid = threads->tid;
        my %result;
        while ( my $work = $Qwork->dequeue ) {
            'require ' . ref($work) . ';';
            $work->max_hits($max_hits);
            $result{$work->resource_id} = $work->get_search_results($work->resource_id, $nuc_code);
            $Qresults->enqueue( \%result );
        }
        $Qresults->enqueue( undef );
    }
Re^3: Parallel Search using Thread::Pool by BrowserUk (Patriarch) on Jun 01, 2009 at 13:41 UTC
"It takes too much time to retrieved the result from a source"
How do you know it is taking too long? How long is too long? How are you measuring it?
I'll try to help, but you are going to have to explain what you are doing a lot more clearly than you have to date.
Are you trying to display the results on a web page as you get them? If so, that could be the source of your problems. Whilst not impossible, it is quite difficult to render web pages on-the-fly because HTML simply wasn't designed to work that way.
If not, then you are going to have to describe or post the overall operation of the application, rather than just keep posting the same basic snippet. What type of application is it: GUI, CLI, web app? What are you searching: DBs, web pages, other? You mention 50 searches and the possibility of 10 concurrent instances. Does each instance search all 50 sources? Are they all searching the same sources?
I have looked at your earlier posts, but as I do not understand what you are trying to achieve, it's hard to begin to help you. I don't need the specific details of the data, but a clear overview of the dataflows is essential. Also, how long is it taking currently, and what is your target?
Re^4: Parallel Search using Thread::Pool by shanu_040 (Sexton) on Jun 02, 2009 at 04:56 UTC
Re^5: Parallel Search using Thread::Pool by BrowserUk (Patriarch) on Jun 02, 2009 at 10:59 UTC
Re^2: Parallel Search using Thread::Pool by shanu_040 (Sexton) on Mar 17, 2009 at 11:22 UTC
Thanks, my application is web based. I have created a different Perl module (.pm) for each search source (site). Currently I am using Parallel::ForkManager. The problem I am facing is that it waits for all the children to finish their tasks, and only then can I display the results. I am stuck on how to display results as each child retrieves them, adding the other children's results to the display as they arrive. What would be the algorithm for this problem?
Thanks, Shanu
Re^3: Parallel Search using Thread::Pool by Anonymous Monk on Mar 17, 2009 at 12:00 UTC
Watching long processes through CGI (Aug 02)
Re^3: Parallel Search using Thread::Pool by shanu_040 (Sexton) on May 22, 2009 at 18:42 UTC
Hi monks, I am still waiting to get some kind of solution from your side.