in reply to Async DNS with LWP

I have a module that replaces LWP's HTTP and HTTPS backends with AnyEvent::HTTP, allowing you to easily do parallel requests using Coro threads. I'll try to publish it tonight or tomorrow night.
    my @threads;
    for (...) {
        push @threads, async { ... do LWP stuff here ... };
    }
    $_->join() for @threads;

Update: Oh yeah, Coro already provides some kind of support for HTTP through LWP, but it's hackish and it doesn't work with HTTPS.

Re^2: Async DNS with LWP
by jc (Acolyte) on Oct 05, 2010 at 19:42 UTC
    That sounds great. AnyEvent::HTTP with Coro is just about the conclusion I've arrived at and I'm making some progress with it. So now I'm wondering if your changes can be ported to WWW::Mechanize... That would certainly make developing stateful crawling a lot easier.
      WWW::Mechanize doesn't actually do any socket work. It lets LWP do it, so nothing needs to be done. Keep in mind that Coro is cooperative multitasking: a thread only gives up control when it waits (for example, on a socket), so your sockets can't receive anything while your crawler is busy doing something other than waiting for data.
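To illustrate the cooperative-multitasking point, here is a minimal sketch using only Coro's `async`, `cede`, and `join`. The thread names (`$busy`, `$io`) and the `@log` array are made up for the illustration, and `cede` stands in for the moment a real crawler would block on a socket. It is not taken from the module mentioned above.

```perl
use strict;
use warnings;
use Coro;    # exports async and cede

my @log;     # hypothetical: records the order in which the threads ran

# Stands in for a crawler thread doing CPU-bound work (parsing, etc.).
# It only lets other threads run at the points where it calls cede.
my $busy = async {
    for my $step (1 .. 3) {
        push @log, "busy $step";
        cede;    # remove this and $io cannot run until this loop ends
    }
};

# Stands in for a thread that wants to service a socket.
my $io = async {
    for my $step (1 .. 3) {
        push @log, "io $step";
        cede;
    }
};

# Wait for both threads to finish, then show the order of execution.
$_->join for $busy, $io;
print "$_\n" for @log;
```

With the `cede` calls in place, the two threads should take turns. If the `cede` in `$busy` is removed, `$io` gets no chance to run until `$busy` has finished, which is the situation a crawler is in when it spends a long time processing a page without waiting on I/O.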
        Sounds great! So, we just replace the LWP module with your AnyEvent::HTTP / Coro version and things should work for Mechanize out of the box? Not sure I see what you mean by your point about Coro. If my crawler isn't spending any time waiting for data I will be extremely happy that it is crawling as fast as my network connection allows.