Do I need to use Coro instead of threads/forks

by mohan2monks (Beadle)
on Sep 29, 2014 at 10:37 UTC

mohan2monks has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks, I am here again to seek your wisdom.

I have a CGI script (which I am trying to turn into a PSGI app, see Change cgi script to PSGI app) that has to connect to different vendor APIs, collect data, and return it as JSON/XML output.

I am trying to fetch this data simultaneously from each vendor by creating one thread per vendor.
I need to create at most 10 threads.
But now the vendor API has changed, and I will have to make additional API calls in each thread.
Example: call a vendor API to get a list of products, then fetch the details of each product with a separate API call using the product id.
If I start threads from inside a thread that is already running, the script crashes and memory usage increases too; plus I need to load too many modules (maybe I don't know how to do it correctly).

I have been searching PerlMonks and the net for possible alternatives and found that, as my script usually wastes a lot of time waiting for SOAP::Lite calls, I may be better off using Coro instead of threads (one of the references: Why Coro?).
Also, after reading on PerlMonks, I changed to using forks/forks::BerkeleyDB (forgive me, I don't have the reference for this now).
It worked fine as long as I did not need to start threads from inside already running threads. If I fork again for each product, Apache ends up with too many children.
I actually tried it on a test system, and my admin has not been nice to me since :(

Can I safely use Coro to do this?
I am a bit confused about how to use Coro's cede function explicitly (where exactly to put it inside the threads). If I cede inside a thread, will that thread run again?
Will all the vendor functions that I start in the main thread run in parallel?
I have read http://search.cpan.org/~mlehmann/Coro-6.41/Coro.pm in detail, but I am still confused.

Please enlighten me.

#!/usr/bin/perl
#use forks::BerkeleyDB;
#use threads;
use strict;
use Coro;
use CGI qw(:standard);
use XML::Simple;
use JSON qw(encode_json);
use MIME::Base64 qw(encode_base64);
use DBI;
use SOAP::Lite;
use Data::Dumper;
use CHI;

my (@req,@odata,@errdet)=();
my %threads=();

foreach my $vnd (@req){
    #$threads{$vnd}=threads->create({ 'context' => 'list', 'exit' => 'thread_only' },\&startthread,$vnd,$SCH);
    $threads{$vnd}=async{ &startthread($vnd,$SCH) };
}
cede;
foreach my $vnd (keys %threads) {
    my($err,$data) =$threads{$vnd}->join();
    if ($err eq 'N') {
        push @odata,@{$data};
    }elsif($err eq 'Y'){
        push @errdet,$data;
    }elsif($threads{$vnd}->error) # with threads it was useful,
    {
        push @errdet,{ Vnd=>$vnd, ErrorCode=>26,ErrorMsg=>$threads{$vnd}->error};
    }
}

sub startthread {
    my ($vendor,$SCH)=@_;
    my($err,$data) =();
    require './'.$vendor.'functions.pl'; #load vendor specific code
    ($err,$data)=&getvendordata($SCH);
    return($err,$data);
}

#vendor functions has somewhat code like this besides functions specific to each vendor api
sub getvendordata {
    my $SCH=shift;
    #perform vendor specific tasks mostly SOAP::Lite calls to api
    #Has to perform multiple calls some of them depend on output of previous calls
    #Also have to make simultaneous calls which may not be related to each other can be done parallelly.
    #e.g. first call to api returns a list of products subsequent calls to fetch details.
    #If i start threads/forks again here most of the time program crashes :(
    my ($err,@data)=();
    my @products=getproduects();
    #Maybe i can start coro routines here to get all listed details simultaneously
    foreach my $product (@products) {
        #fetch product details over api
        push @data,$product->getdetails;
    }
    return ($err,\@data);
}

Replies are listed 'Best First'.
Re: Do I need to use Coro instead of threads/forks
by Corion (Patriarch) on Sep 29, 2014 at 11:46 UTC

    My advice is to not deal with Coro::cede but to use a Coro-aware LWP::UserAgent implementation, for example Coro::LWP. Then you can just use async { ... } to launch your processing in a non-blocking fashion.
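
    A minimal sketch of that suggestion, assuming plain HTTP GET requests stand in for the vendor calls. The fetch_all helper, the 10-second timeout, and taking URLs from the command line are illustrative choices, not part of the original script. Coro::LWP is loaded before LWP::UserAgent, as its documentation asks; whether this carries over to SOAP::Lite would still need testing.

```perl
use strict;
use warnings;
use Coro;
use Coro::LWP;          # load before LWP so its blocking parts get replaced
use LWP::UserAgent;

# Hypothetical helper: fetch every URL in its own coro and
# return a hash reference of { url => HTTP status line }.
sub fetch_all {
    my @urls = @_;
    my %coro_for = map {
        my $url = $_;
        ($url => async {
            # One user agent per coro keeps the requests independent.
            my $ua = LWP::UserAgent->new(timeout => 10);
            return $ua->get($url)->status_line;
        });
    } @urls;

    # join blocks only the calling coro; the other coros keep running.
    return {
        map { my ($line) = $coro_for{$_}->join; ($_ => $line) }
            keys %coro_for
    };
}

# URLs come from the command line; with none given, nothing is fetched.
my $status = fetch_all(@ARGV);
print "$_ => $status->{$_}\n" for sort keys %$status;
```

    Saved as, say, fetch.pl, it could be run as: perl fetch.pl http://example.com/ http://example.org/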

    I don't understand why you want to use Coro (or forks::BerkeleyDB) - maybe you want to communicate the data you fetch via a database or files instead. That would make the communication much easier, see perlipc.

      Thanks for your help, Corion.
      I am using SOAP::Lite to make calls to the vendor APIs, and it uses LWP internally.
      I tried to override that but may have done something wrong. If I use Coro::LWP, will it override that behavior? I went looking into the source of LWP, and it requires LWP::UserAgent and related modules.

      Your other suggestion, if I understand correctly, is to fetch the data, store it in a DB or files, and read it from there to serve as JSON.
      About the product data example that I gave:
      Basically, this script is used to search inventory based on a user search; the results must be fetched directly from the APIs every time by my script, which adds some of our own data and returns them to the user.
      This is inventory like airline seats, hotel rooms, etc., which changes frequently. I cannot cache this information (cache misses outnumber cache hits; I used CHI with the BerkeleyDB driver), and the number of search combinations could be too large to fetch and write out as data.
      Biggest problem is main threads must wait till all threads return data, hence I cannot use detach, where a detached thread writes data to a file/DB and the main thread polls for the write, etc.

      Is there any other approach I can follow?

        Biggest problem is main threads must wait till all threads return data,

        Why is that a problem?

        As I understand it, you have a cgi that accepts some user search terms. Once those search terms are returned to you, you then want to forward those terms to several vendor sites, aggregate the information they return to you, and then present the aggregation back to the user. (Is that correct? )

        You cannot present the aggregation until you have all the data; so why is it a problem to wait for the threads to complete?

        Unless you are hoping to present the data back to the user piecemeal, as you receive it?

        In which case: set up a queue; detach the threads and have them post the data they receive to that queue. Then your main thread does not have to wait for all the threads to complete; it simply monitors the queue and deals with the data as it arrives.
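
        The queue approach could be sketched as follows, using the core threads and Thread::Queue modules. The vendor names and the fetch_vendor stub (which only sleeps) are hypothetical placeholders for the real API calls.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my @vendors = qw(vendorA vendorB vendorC);   # hypothetical vendor names
my $queue   = Thread::Queue->new;

# Hypothetical stand-in for a vendor API call.
sub fetch_vendor {
    my ($vendor) = @_;
    sleep 1;                                 # pretend to wait on the network
    return { vendor => $vendor, items => [ "$vendor-item1", "$vendor-item2" ] };
}

# One detached worker per vendor; each posts exactly one result.
for my $vendor (@vendors) {
    threads->create(sub {
        my $result = eval { fetch_vendor($vendor) }
            // { vendor => $vendor, error => "$@" };
        $queue->enqueue($result);            # the queue stores a shared copy
    })->detach;
}

# The main thread handles results in arrival order, not vendor order.
my @results;
for (1 .. @vendors) {
    my $result = $queue->dequeue;            # blocks until a worker posts
    print "got data from $result->{vendor}\n";
    push @results, $result;
}
```

        Because every worker posts exactly one item (a result or an error), the main thread knows how many items to expect and never has to call join. A detached worker that is still shutting down when the main thread exits can trigger a "Perl exited with active threads" warning; creating the threads without detach and joining them after the queue has been drained avoids that.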


        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        From what I understand of your problem, using Coro should work. Basically, your program should be something like your threads example, except that you use Coro::LWP and you use Coro instead of OS threads.

        What have you tried so far that gives you problems?

        Maybe you want to start with a small example program that does not talk to your vendor but tries to make several parallel requests to Google?
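
        A small example along those lines, with the network replaced by a timer so it runs anywhere Coro is installed. The fake_api_call and fetch_vendor subs and the vendor names are hypothetical. Each vendor coro starts one further coro per product, which is the nested case that crashed with threads. No explicit cede appears: a coro that waits in Coro::AnyEvent::sleep is suspended and the others run meanwhile. A coro that does call cede is put back on the ready queue, so it runs again later.

```perl
use strict;
use warnings;
use Coro;
use Coro::AnyEvent;
use Time::HiRes qw(time);

# Hypothetical stand-in for one blocking vendor API call: it only waits.
sub fake_api_call {
    my ($what, $seconds) = @_;
    Coro::AnyEvent::sleep $seconds;      # suspends only the current coro
    return "$what done";
}

# Runs inside one coro per vendor and starts one more coro per product.
sub fetch_vendor {
    my ($vendor) = @_;
    fake_api_call("$vendor list", 0.2);                  # first call: product list
    my @products = map { "$vendor-product$_" } 1 .. 3;   # pretend result of that call
    my @detail_coros = map {
        my $product = $_;
        async { fake_api_call($product, 0.2) };
    } @products;
    return [ map { $_->join } @detail_coros ];           # wait for all details
}

my $start = time;
my %coro_for = map { my $v = $_; ($v => async { fetch_vendor($v) }) }
    qw(vendorA vendorB);
my %result   = map { $_ => $coro_for{$_}->join } keys %coro_for;
my $elapsed  = time - $start;

printf "%s: %s\n", $_, join(", ", @{ $result{$_} }) for sort keys %result;
printf "elapsed: %.1fs\n", $elapsed;
```

        Run one after another, the eight simulated calls would need about 1.6 seconds; run as coros, the total should stay close to the 0.4 seconds of one list call plus one detail call. Coros still share a single OS thread, so the gain comes from overlapping the waits, not from running Perl code in parallel.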

Re: Do I need to use Coro instead of threads/forks (reposts)
by LanX (Saint) on Sep 29, 2014 at 12:26 UTC
    Please stop reposting the same question over and over again. :)

    Cheers Rolf

    (addicted to the Perl Programming Language and ☆☆☆☆ :)

      Sorry!
      Must have posted by mistake
