http://qs1969.pair.com?node_id=176121


in reply to Re: Perl Search Applicance
in thread Perl Search Applicance

As mattr has pointed out, have a look at namazu.org. Their crawler also seems to index PDF files.
I would recommend trying out, or reading the code of, the other search engines mattr listed in his post before you start writing your own. You will get a lot of useful ideas from them.

As for the database, I currently use MySQL (unfortunately; it's slow). I give each word a unique id, and then split the words found in the documents over several tables, so that no single table gets too large.
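
To make the layout a bit more concrete, here is a rough sketch of the idea, not my actual schema. It is written in Python with an in-memory SQLite database standing in for MySQL, so it runs without a server. The table names (`words`, `postings_N`), the number of tables, and the rule for picking a table (word id modulo the table count) are all assumptions made up for illustration; any rule that spreads the words evenly would do.

```python
import re
import sqlite3

NUM_TABLES = 4  # assumed number of posting tables; pick whatever keeps each one small


def open_index():
    """Create the word table and the posting tables in an in-memory database."""
    db = sqlite3.connect(":memory:")
    # One row per distinct word; the id is the word's unique id.
    db.execute("CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE)")
    for n in range(NUM_TABLES):
        # Each posting table records which documents contain which word ids.
        db.execute(
            f"CREATE TABLE postings_{n} ("
            "word_id INTEGER, doc_id INTEGER, PRIMARY KEY (word_id, doc_id))"
        )
    return db


def word_id(db, word):
    """Return the unique id of a word, assigning a new one on first sight."""
    db.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
    row = db.execute("SELECT id FROM words WHERE word = ?", (word,)).fetchone()
    return row[0]


def add_document(db, doc_id, text):
    """Index every distinct word of a document into the table its id maps to."""
    for word in set(re.findall(r"\w+", text.lower())):
        wid = word_id(db, word)
        table = f"postings_{wid % NUM_TABLES}"  # assumed split rule
        db.execute(f"INSERT OR IGNORE INTO {table} VALUES (?, ?)", (wid, doc_id))


def search(db, word):
    """Return the sorted ids of the documents that contain the word."""
    row = db.execute(
        "SELECT id FROM words WHERE word = ?", (word.lower(),)
    ).fetchone()
    if row is None:
        return []
    wid = row[0]
    table = f"postings_{wid % NUM_TABLES}"
    rows = db.execute(
        f"SELECT doc_id FROM {table} WHERE word_id = ? ORDER BY doc_id", (wid,)
    )
    return [r[0] for r in rows]
```

A lookup only ever touches the small `words` table and one posting table, which is the point of the split. Call `open_index()` once, `add_document(db, doc_id, text)` for each crawled page, and `search(db, word)` to get the matching document ids.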