I see what you mean. In fact, I was toying with the idea of the following architecture to maximise throughput:
* Set up asynchronous DNS to quickly resolve all 90,000,000 domains and find out which ones reside at the same IP (shared hosting domains).
* Send a TCP ACK to each web server (asynchronously) to get a shortlist of the domain names that actually have a web server which responds (it should send a RST in response to the unexpected ACK).
* Then send out TCP connects with a short timeout to shortlist the servers that respond fastest.
* To those that respond fast enough, send HEAD requests to obtain document sizes.
* Asynchronously GET the smallest documents first, so that the database of links grows as fast as possible. Any thoughts?
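
To make the first and last stages concrete, here is a rough sketch, written in Python rather than Perl/LWP purely for illustration. The names `resolve`, `group_by_ip`, `fetch_order` and `max_in_flight` are all made up for this example, and `loop.getaddrinfo` runs the blocking lookup in a thread pool by default, so treat it as a stand-in for a truly asynchronous resolver rather than the real thing.

```python
import asyncio
import socket
from collections import defaultdict


async def resolve(domain, sem):
    """Resolve one domain to an IPv4 address, or None if the lookup fails."""
    loop = asyncio.get_running_loop()
    async with sem:  # cap the number of lookups in flight
        try:
            infos = await loop.getaddrinfo(
                domain, 80, family=socket.AF_INET, type=socket.SOCK_STREAM
            )
        except OSError:  # socket.gaierror is a subclass of OSError
            return None
    return infos[0][4][0] if infos else None


async def group_by_ip(domains, resolver=resolve, max_in_flight=500):
    """Stage 1: resolve concurrently and bucket domains by shared IP."""
    sem = asyncio.Semaphore(max_in_flight)
    ips = await asyncio.gather(*(resolver(d, sem) for d in domains))
    groups = defaultdict(list)
    for domain, ip in zip(domains, ips):
        if ip is not None:  # drop domains that did not resolve
            groups[ip].append(domain)
    return dict(groups)


def fetch_order(sizes):
    """Last stage: order URLs smallest first; unknown sizes (None) go last."""
    return sorted(sizes, key=lambda url: (sizes[url] is None, sizes[url] or 0))
```

Here `sizes` is assumed to be a dict mapping each URL to the Content-Length reported by its HEAD response, or `None` where the server did not send one. Passing the resolver in as a parameter makes it easy to swap in a proper async DNS library later, and the semaphore keeps 90 million lookups from all being launched at once.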
In reply to Re^18: Async DNS with LWP by jc, in thread Async DNS with LWP by jc