in reply to internet search engine
First, you will need a fast, high-capacity datastore. Probably not an SQL database; you'll want something very fast, very expandable, and very dependable, with a lot of internal redundancy. Look into the NoSQL key-value datastores, or you may decide you need to write one. Blekko wrote one.
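To give a rough feel for what "internal redundancy" means in a key-value store, here is a toy in-memory sketch. It is not how Blekko's datastore works (those details aren't public here); the class name `ReplicatedStore`, the node count, and the "write to the next few nodes" placement rule are all assumptions made up for illustration.

```python
import hashlib


class ReplicatedStore:
    """Toy key-value store: every key is copied to `replicas` different nodes.

    Hypothetical illustration only. Each "node" is just a dict in memory.
    """

    def __init__(self, num_nodes=5, replicas=3):
        if replicas > num_nodes:
            raise ValueError("replicas cannot exceed num_nodes")
        self.nodes = [dict() for _ in range(num_nodes)]
        self.replicas = replicas

    def owners(self, key):
        """Return the node indexes that hold copies of `key`."""
        digest = hashlib.sha1(key.encode("utf-8")).digest()
        start = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.replicas)]

    def put(self, key, value):
        # Write the value to every replica.
        for n in self.owners(key):
            self.nodes[n][key] = value

    def get(self, key, down=()):
        # Read from the first replica that is not in the `down` set.
        for n in self.owners(key):
            if n not in down and key in self.nodes[n]:
                return self.nodes[n][key]
        return None


store = ReplicatedStore(num_nodes=5, replicas=3)
store.put("http://example.com/", "<html>...</html>")
```

With three copies, a read still succeeds when two of the three owning nodes are unavailable. A real system also has to handle rebalancing, consistency between copies, and disk persistence, which is where most of the work goes.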
Second, you will want fast indexing and a way to quickly run queries across the whole of your database; something like Hadoop or BigTable. Blekko wrote one.
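The core data structure behind "fast indexing" is usually an inverted index: a map from each term to the set of documents containing it. Below is a minimal single-machine sketch, assuming a crude lowercase tokenizer and AND-only queries; the function names (`tokenize`, `build_index`, `search`) and the sample documents are made up for illustration, and this says nothing about how Blekko's index is actually built.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric terms (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> set of document ids containing that term."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that contain every term in the query (AND semantics)."""
    terms = tokenize(query)
    if not terms:
        return set()
    postings = [index.get(term, set()) for term in terms]
    return set.intersection(*postings)


docs = {
    "a": "Crawlers are easy",
    "b": "Search engines are hard",
    "c": "Writing a search engine is not easy",
}
index = build_index(docs)
hits = search(index, "search easy")
```

Here `hits` contains only document `"c"`, the one document with both terms. The hard part at web scale is everything this leaves out: sharding the index across hundreds of machines, ranking, and keeping it fresh as the crawl changes.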
I'm leaving out the vast majority of details here because they're proprietary information, but the summary is that you'll need a big (hundreds of machines), fast datastore to store your crawl and index, and a good mechanism to access it quickly. Blekko wrote all these.
It's taken Blekko 4 years to get to where they are now (which is "pretty darn good, better than Google in some places, not as good in others") with about 20 people (though they started with about 7 or 8). You're in for a long haul, and your backers will need to be patient. Writing a search engine is not easy, and it will go better if you have folks on board who have already worked on one for a while.
Crawlers are easy; search engines are hard.