2. How to display the results as they become available from any source? The application must not wait for all of them.

3. How to merge all the results when I am asking for incremental display?

You need to employ the services of a seriously experienced web architect. Serving HTML incrementally requires detailed knowledge of both the webserver and the browsers you are seeking to target. I have neither.

Fetching the data from 50 sources concurrently and merging the required results back together is relatively trivial. One thread per source and a common queue. Results are posted to the queue by the threads and the CGI thread reads them off, formats them and serves them.
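A minimal sketch of that one-thread-per-source-plus-queue arrangement, assuming a Perl built with threads support and the core Thread::Queue module. The names fetch_source and gather_results are made up for illustration, and the fetch itself is a placeholder for whatever each source really needs:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the real per-source fetch (an HTTP request, say).
sub fetch_source {
    my ($source) = @_;
    return "result from $source";
}

# Start one worker per source and collect results in arrival order.
sub gather_results {
    my (@sources) = @_;
    my $Q = Thread::Queue->new;

    # Each worker posts its result, then an undef marker meaning "I am done".
    my @workers = map {
        my $source = $_;
        threads->create( sub {
            $Q->enqueue( fetch_source($source) );
            $Q->enqueue(undef);
        } );
    } @sources;

    my @results;
    my $running = @workers;
    while ($running) {
        my $item = $Q->dequeue;    # blocks until a worker posts something
        if ( defined $item ) {
            push @results, $item;  # this is where you would format and serve
        }
        else {
            --$running;            # one more worker has finished
        }
    }
    $_->join for @workers;
    return @results;
}

my @results = gather_results( map { "source$_" } 1 .. 5 );
print "$_\n" for @results;
```

The reader never waits for all the sources: it handles each result as soon as it is dequeued, and stops once every worker has posted its end marker.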

The difficult part is the web handling. HTTP is a request-response protocol. The browser sends a request; the server sends a response; the browser displays it. And the server won't send anything else until the browser sends another request. So to display results incrementally, you have to arrange for the browser to re-request to get updates.

That can be done with meta tags, JavaScript or by having the user hit refresh, but the new response will then overwrite the browser's display, obliterating what was sent the first time. So for the user to see the results build up incrementally, you have to re-send any results you sent the last time plus any additions. That means the server has to remember what it sent--and to whom. And as HTTP is stateless, that means having a way to identify each user and persistent storage to record what to send to whom. How you go about doing that will depend upon what web server you use; what session mechanism you use; what persistent storage you have; what web-app software/framework/development tools you use; and so on.
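To make the "re-send everything so far, plus a refresh" idea concrete, here is a rough sketch of a page builder. The function names, the markup and the refresh interval are all my own assumptions, not anything a particular framework gives you:

```perl
use strict;
use warnings;

# Minimal HTML escaping so result text cannot break the markup.
sub escape_html {
    my ($text) = @_;
    my %entity = ( '&' => '&amp;', '<' => '&lt;', '>' => '&gt;', '"' => '&quot;' );
    $text =~ s/([&<>"])/$entity{$1}/g;
    return $text;
}

# Build the complete page: every result received so far, plus a meta
# refresh tag for as long as more results are still expected.
sub render_page {
    my ( $results, $done, $refresh_secs ) = @_;
    my $refresh = $done
        ? ''
        : qq{<meta http-equiv="refresh" content="$refresh_secs">\n};
    my $items = join '', map { '<li>' . escape_html($_) . "</li>\n" } @$results;
    return "<html><head>\n$refresh<title>Results</title></head>\n"
         . "<body><ul>\n$items</ul></body></html>\n";
}

# Two results so far, more expected: ask the browser to come back in 2 seconds.
print render_page( [ 'first hit', 'second <hit>' ], 0, 2 );
```

Once the last source has reported, calling it with a true $done leaves the refresh tag out, so the browser stops re-requesting.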

There's also the problem of how your webserver is going to handle running 500 concurrent Perl threads. From my very limited understanding of Apache, it doesn't like (Perl) threads much, and less so if you are also using mod_perl or FastCGI.

If I were trying to do this, I would have the webserver hand-off the query to a dedicated Perl process. Something like this:

  1. Webserver receives a request for the query form and serves it.
  2. When it receives the completed form, it validates the query and, if it is good, spawns a separate instance of Perl, passing it the query parameters and retrieving the port number that the new Perl instance will listen on.
  3. It then sends the browser a redirect to that port number.

    The browser is now talking directly to a Perl instance dealing with its particular query.

    That Perl instance starts the 50 threads and issues the requests.

  4. When the Perl process receives the redirected request on the port it opened, it formats any results it has received so far and sends the HTML with a meta refresh tag.
  5. Each time the refresh request is received, it adds any new results to those accumulated last time and re-sends the response.
  6. Repeat till done.
  7. Redirect the user back to the web server.
  8. Terminate the Perl process.

This way, as each query is being serviced by a dedicated Perl instance, there is no possibility of mixing up the users and no need for persistent storage that would need to be cleaned up. When the user quits or the session times out, the process terminates and everything is cleaned up automatically.
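For what it's worth, the listening side of such a dedicated process might look something like the sketch below, using the core IO::Socket::INET module. It lets the OS pick a free port (LocalPort => 0) and answers one request at a time. The names open_listener and serve_once are invented for illustration, the HTML is assumed to be plain ASCII, and nothing here handles timeouts or malformed requests:

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Open a listening socket on a free port chosen by the OS (LocalPort => 0).
sub open_listener {
    my $listener = IO::Socket::INET->new(
        LocalAddr => '127.0.0.1',
        LocalPort => 0,
        Listen    => 5,
        Proto     => 'tcp',
    ) or die "Cannot listen: $!";
    return $listener;
}

# Accept one connection, skip the request headers, send one complete response.
sub serve_once {
    my ( $listener, $html ) = @_;
    my $client = $listener->accept or return;

    # Read and discard the request up to the blank line that ends the headers.
    while ( defined( my $line = <$client> ) ) {
        last if $line =~ /^\r?\n$/;
    }

    print {$client} "HTTP/1.0 200 OK\r\n",
                    "Content-Type: text/html\r\n",
                    "Content-Length: ", length($html), "\r\n",
                    "Connection: close\r\n\r\n",
                    $html;
    close $client;
    return 1;
}

# The spawning webserver would read this line to learn where to redirect.
my $listener = open_listener();
print "listening on port ", $listener->sockport, "\n";
```

The main loop of the process would then call serve_once repeatedly, each time with a freshly built page containing everything received so far, until all the sources have reported.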

But I'm not a web guy, so take that with a huge pinch of salt and pay someone for their advice and knowledge. Choose them carefully.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.
"Too many [] have been sedated by an oppressive environment of political correctness and risk aversion."

In reply to Re^5: Parallel Search using Thread::Pool by BrowserUk
in thread Parallel Search using Thread::Pool by shanu_040
