PerlMonks  

Re^2: Do I need to use Coro instead of threads/forks

by mohan2monks (Beadle)
on Sep 29, 2014 at 12:49 UTC ( [id://1102347]=note )


in reply to Re: Do I need to use Coro instead of threads/forks
in thread Do I need to use Coro instead of threads/forks

Thanks for your help, Corion.
I am using SOAP::Lite to make calls to the vendor APIs, and it uses LWP internally.
I tried to override that but may have done something wrong. If I use Coro::LWP, will it override that behavior? I went looking into the source of LWP, and it requires LWP::UserAgent and related modules.

Your other suggestion, if I understand correctly, is to fetch the data, store it in a DB or files, and read from there to serve it as JSON.
The product data I gave was only an example.
Basically, this script searches inventory based on a user search. The results must be fetched directly from the APIs every time; my script adds some of our own data and returns the result to the user.
This is inventory like airline seats, hotel rooms, etc., which changes frequently, so I cannot cache this information (cache misses were higher than cache hits when I used CHI with the BerkeleyDB driver). Further, the number of search combinations could be too large to fetch and store in advance.
The biggest problem is that the main thread must wait until all threads return data, hence I cannot use detach (where a detached thread writes data to a file/DB and the main thread polls for the write, etc.).

Is there any other approach I can follow?

Re^3: Do I need to use Coro instead of threads/forks
by BrowserUk (Patriarch) on Sep 29, 2014 at 13:13 UTC
    Biggest problem is main threads must wait till all threads return data,

    Why is that a problem?

    As I understand it, you have a cgi that accepts some user search terms. Once those search terms are returned to you, you then want to forward those terms to several vendor sites, aggregate the information they return to you, and then present the aggregation back to the user. (Is that correct? )

    You cannot present the aggregation until you have all the data; so why is it a problem to wait for the threads to complete?

    Unless you are hoping to present the data back to the user piecemeal, as you receive it?

    In which case: set up a queue; detach the threads and have them post the data they receive to that queue. Then your main thread does not have to wait for all the threads to complete; it simply monitors the queue and deals with the data as it arrives.
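
    A minimal sketch of that queue arrangement, using the core threads and Thread::Queue modules. The vendor names and the fetch_from_vendor() stub are hypothetical stand-ins for the real SOAP::Lite calls; this is an illustration under those assumptions, not the code from the original script.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical vendor names; the real script would have its own list.
my @vendors = qw(vendorA vendorB vendorC);

# Stand-in for the real SOAP::Lite call (assumption, not the actual API).
sub fetch_from_vendor {
    my ($vendor, $terms) = @_;
    return "$vendor: results for '$terms'";
}

sub search_all {
    my ($terms) = @_;
    my $queue = Thread::Queue->new;

    for my $vendor (@vendors) {
        # Each detached worker posts exactly one item, then exits.
        threads->create(sub {
            my $result = eval { fetch_from_vendor($vendor, $terms) };
            $queue->enqueue(defined $result ? $result : "$vendor: failed");
        })->detach;
    }

    # The main thread deals with results in the order they arrive.
    my @seen;
    for (1 .. @vendors) {
        my $item = $queue->dequeue;    # blocks until some worker posts
        push @seen, $item;             # a CGI could print and flush here
    }
    return @seen;
}

my @results = search_all('paris hotel');
print "$_\n" for @results;
```

    Because every worker posts one item even on failure, the main thread knows how many items to expect and never needs to join anything. A real script would probably also want a timeout on the dequeue so that one slow vendor cannot hold up the response indefinitely.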


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Yes, BrowserUk, your understanding is correct.
      What I meant by "the main program must wait" was exactly that I must aggregate the content and hence cannot detach the threads. (I never thought of queuing, thinking it would be hard to manage.)
      Or can I try one of your excellent suggestions from Re: Why Coro?, using an OS thread for each vendor and then coros inside them to fulfil the new requirement?
      Maybe that would give me the best of both.
      Please correct me if I am wrong.

        Or can I try one of your excellent suggestions from Re: Why Coro?, using an OS thread for each vendor and then coros inside them to fulfil the new requirement?

        For just 10 threads, mixing Coro/threads would be overkill. (It has also never been confirmed to me that Coro is thread safe.) Personally, I'd stick to the known simplicity of threads. Also, if you go the mixed architecture route, you may find yourself out in the cold as far as support is concerned. I'm pretty sure that the Coro author doesn't do threads; and I don't do Coro; so you'd be on your own if things go tits-up.

        Also, unless your vendors are unusually tolerant, if you go hitting their websites with multiple concurrent requests, you are likely to get your accounts suspended. You may even find that hitting them with large numbers of requests serially will be frowned upon unless you put some delays between each request.

        My advice is: stick to one thread per vendor and a simple serial loop over the request to each vendor. Once you have that working, you can benchmark and look for opportunities to improve performance if it proves necessary.
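
        A rough sketch of that layout, using only the core threads module. The %requests table, the call_vendor() stub, and the $delay parameter are hypothetical; the stub stands in for the real SOAP::Lite call.

```perl
use strict;
use warnings;
use threads;

# Hypothetical vendors, each with its own list of requests to make.
my %requests = (
    vendorA => [ 'rooms', 'rates' ],
    vendorB => [ 'seats' ],
);

# Stand-in for one SOAP::Lite call (assumption, not the real API).
sub call_vendor {
    my ($vendor, $request) = @_;
    return "$vendor/$request";
}

# Runs inside one thread: issue this vendor's requests one after another.
sub vendor_worker {
    my ($vendor, $delay, @reqs) = @_;
    my @out;
    for my $i (0 .. $#reqs) {
        push @out, call_vendor($vendor, $reqs[$i]);
        # Pause between requests so the vendor is not hammered.
        sleep $delay if $delay && $i < $#reqs;
    }
    return @out;
}

sub aggregate {
    my ($delay) = @_;
    my %thread_for;
    for my $vendor (sort keys %requests) {
        # List context so that join() returns every result.
        $thread_for{$vendor} = threads->create(
            { context => 'list' },
            \&vendor_worker, $vendor, $delay, @{ $requests{$vendor} },
        );
    }
    # Wait for every vendor thread, then collect its results.
    my %results;
    $results{$_} = [ $thread_for{$_}->join ] for sort keys %thread_for;
    return \%results;
}

my $results = aggregate(0);    # 0 = no delay, for a quick local run
print "$_: @{ $results->{$_} }\n" for sort keys %$results;
```

        The total wall-clock time is roughly that of the slowest vendor, which gives a baseline to benchmark against before trying anything more elaborate.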


Re^3: Do I need to use Coro instead of threads/forks
by Corion (Patriarch) on Sep 29, 2014 at 12:59 UTC

    From what I understand of your problem, using Coro should work. Basically, your program should be something like your threads example, except that you use Coro::LWP and you use Coro instead of OS threads.

    What have you tried so far that gives you problems?

    Maybe you want to start with a small example program that does not talk to your vendor but tries to make several parallel requests to Google?

      I have created a test where I removed the vendors and put in Coros instead of threads, both in the main and the worker threads.
      The script works fine even on actual data (tried with only two vendors, i.e. 2 main threads, with thread 1 opening 2 more inside it).
      I am measuring the time taken to complete the output and have not seen any improvement so far.
      Further testing is needed to see whether it is actually fetching data in parallel.
      Over 3 iterations it took 12 to 17 sec to finish.
      With OS threads it should be around 10 to 12 sec (I need to confirm this).

        Note that you cannot present all data until the last vendor has replied.

        If you want to present information as it becomes available, you will need to add Javascript code on the browser side to read the information as it becomes available and progressively display the information to the user.
