in reply to ...How to parse search engine results fast?

The other metasearch engines may have bigger hardware and a lot more bandwidth than you do. They might also cache results. Web scraping is a straightforward task, so I doubt they're doing anything inherently faster than you are (except, maybe, using a faster HTML parser).

"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

Re^2: ...How to parse search engine results fast?
by A200560 (Novice) on Feb 03, 2005 at 15:53 UTC
    Ciao, I don't think so. For example, why can dogPile download 200 results from 4 different search engines in 1 second, while on my P4 1.7 GHz, 512 MB RAM, 100 Mbit machine (otherwise completely idle) it takes me 1.5 seconds to download the top 100 from Google?


    Hardware matters in a high-load environment...

      Obviously, dogPile doesn't submit a search to each of the search engines every time you enter something into dogPile. For one thing, that would be very bad for speed (as your problem shows). Surely dogPile saves the results it fetches from the engines and reuses them the next time someone queries for the same search.

      So for example, you search for 'hello world'. dogPile sees that these search terms haven't been fetched before, so dogPile queries all the search engines. Next time someone searches for 'hello world', dogPile doesn't need to refetch the search results since it cached them on its own servers.
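
The caching idea described above can be sketched roughly as follows. This is only an illustration, not how dogPile actually works: the names `QueryCache` and `fake_fetch_all_engines` are made up for the example, and the query normalization and the expiry time (`ttl_seconds`) are assumptions that the post does not mention.

```python
import time


class QueryCache:
    """In-memory cache of search results, keyed by a normalized query string."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch      # callable that actually queries the engines
        self._ttl = ttl_seconds  # assumed expiry; not mentioned in the thread
        self._clock = clock      # injectable so expiry can be tested
        self._store = {}         # normalized query -> (expiry time, results)

    @staticmethod
    def _normalize(query):
        # Assumption: treat queries differing only in case/spacing as the same.
        return " ".join(query.lower().split())

    def search(self, query):
        key = self._normalize(query)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no request to the engines
        results = self._fetch(key)  # cache miss: do the slow fetch once
        self._store[key] = (now + self._ttl, results)
        return results


def fake_fetch_all_engines(query):
    """Hypothetical stand-in for the slow step of querying every engine."""
    fake_fetch_all_engines.calls += 1
    return [f"result {i} for {query}" for i in range(3)]


fake_fetch_all_engines.calls = 0
```

With `cache = QueryCache(fake_fetch_all_engines)`, the first `cache.search("hello world")` calls the fetch function, and a second search for the same terms is answered from memory until the entry expires. Only the first person to ask for a given query pays the full fetch time.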