in reply to How to do Screen Scraping in parallel

There are many solutions to this problem; which one is best depends on how you are displaying the information.

If you're doing a web page, fork off processes and background them, return the page immediately, and have ajax requests poll for the data. (It is up to you to figure out how the forked processes and the web page will communicate.)
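A minimal sketch of that approach, assuming a plain CGI script and a scratch directory the web server can read; the job-id scheme and do_scraping() are illustrative stand-ins, not a particular implementation:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use CGI;

    my $q   = CGI->new;
    my $dir = '/tmp/scrape-jobs';          # illustrative scratch area
    mkdir $dir unless -d $dir;
    my $job = time() . "-$$";              # crude job id, good enough for a sketch

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;

    if ($pid == 0) {
        # Child: detach from the web request, do the slow scraping,
        # and write the result where the polling handler can find it.
        close STDOUT;
        close STDERR;
        my $data = do_scraping();          # stand-in for your scraping code
        open my $fh, '>', "$dir/$job.done" or die $!;
        print {$fh} $data;
        close $fh;
        exit 0;
    }

    # Parent: return a page right away; its JavaScript polls something like
    # poll.pl?job=$job until "$dir/$job.done" shows up, then fetches the data.
    print $q->header('text/html');
    print "<html><body>Working on job $job ...</body></html>\n";

    sub do_scraping { return "scraped data would go here\n" }

The polling script then only has to check whether the file exists and return its contents, so it stays trivially fast.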

If you're doing a GUI, you can use a similar approach, polling for the data. Alternatively, you can explore the world of asynchronous programming, which will lead you to modules like POE. Or you can take on the complications of multi-threaded programming (which in this case is going to look much like the fork approach, except that you've got threads rather than separate processes).
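For the threaded variant, here is a rough sketch using core threads plus Thread::Queue, with LWP::UserAgent doing the fetches; the URL list and the three-worker count are just placeholders:

    use strict;
    use warnings;
    use threads;
    use Thread::Queue;
    use LWP::UserAgent;

    my @urls    = ('http://example.com/a', 'http://example.com/b');   # placeholders
    my $work    = Thread::Queue->new(@urls);
    my $results = Thread::Queue->new;      # workers push finished pages here

    my @workers = map {
        threads->create(sub {
            my $ua = LWP::UserAgent->new;
            while (defined(my $url = $work->dequeue_nb)) {
                my $res = $ua->get($url);
                $results->enqueue(
                    "$url\t" . ($res->is_success ? $res->decoded_content : 'FAILED')
                );
            }
        });
    } 1 .. 3;                              # three worker threads

    # A GUI would poll $results->dequeue_nb from its event loop instead of
    # blocking here; for a batch run, joining the workers is fine.
    $_->join for @workers;
    while (defined(my $rec = $results->dequeue_nb)) {
        my ($url) = split /\t/, $rec, 2;
        print "fetched $url\n";
    }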

If you've got a command line or a batch program, treat it like the GUI.


Re^2: How to do Screen Scraping in parallel
by shanu_040 (Sexton) on May 26, 2009 at 05:28 UTC
    Hi,
    I am using a web page. The program is successfully able to do parallel screen scraping using Parallel::ForkManager, but the thing causing a problem for me is wait_for_all in PFM. My program collects the data and puts it into a hash, but I have no idea how I can use an ajax request to display the content from that hash at the same time. Any idea would be helpful.
    Shanu
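
    One way around having to sit behind wait_for_all: have each forked child write its piece of the result to a file as soon as it finishes, and let the ajax handler serve whichever files exist so far. A rough sketch, assuming a directory the web server can read; @sites and scrape_one() are illustrative stand-ins, not the poster's actual code:

        use strict;
        use warnings;
        use Parallel::ForkManager;

        my @sites = ('http://example.com/1', 'http://example.com/2');   # stand-ins
        my $dir   = '/tmp/scrape-results';     # must be readable by the web server
        mkdir $dir unless -d $dir;

        my $pm = Parallel::ForkManager->new(5);

        for my $site (@sites) {
            $pm->start and next;               # parent immediately moves on
            my $data = scrape_one($site);      # stand-in for the real scraping
            (my $name = $site) =~ s/\W+/_/g;   # make a safe filename
            open my $fh, '>', "$dir/$name.html" or die $!;
            print {$fh} $data;
            close $fh;
            $pm->finish;
        }
        $pm->wait_all_children;                # only this batch script waits here

        sub scrape_one { my ($url) = @_; return "content for $url\n" }

    The ajax endpoint then just globs that directory and returns whatever has appeared so far; it never has to wait for the whole run, and wait_for_all only matters to the script doing the forking.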