PerlMonks  

Re: Re: Perl Search Applicance

by Bluepixel (Beadle)
on Jun 20, 2002 at 19:26 UTC (#176121)


in reply to Re: Perl Search Applicance
in thread Perl Search Applicance

As mattr has pointed out, have a look at namazu.org. Their crawler also seems to index PDF files.
I would recommend that you try out, or read the code of, the other search engines mattr listed in his post before you start writing your own. You will get a lot of useful ideas from them.

As for the database, I currently use MySQL (unfortunately, it's slow). I give each word a unique id and then split the words found in the documents across several tables, so that no single table gets too large.
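
To make the table layout a bit more concrete, here is a rough sketch of one way it could look. This is not my actual schema: the table names (words, postings_0 .. postings_3), the number of tables, and the "word id modulo table count" rule for picking a table are all assumptions made for illustration. It is written in Python with the bundled sqlite3 module standing in for MySQL so that it runs without a server; the same structure should carry over to Perl with DBI.

```python
import re
import sqlite3

NUM_SHARDS = 4  # assumed number of posting tables; pick whatever keeps them small


def create_schema(conn):
    """Create the word dictionary and the (hypothetical) posting tables."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS words ("
        "id INTEGER PRIMARY KEY, word TEXT UNIQUE NOT NULL)"
    )
    for shard in range(NUM_SHARDS):
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS postings_{shard} ("
            "word_id INTEGER NOT NULL, doc_id INTEGER NOT NULL, "
            "PRIMARY KEY (word_id, doc_id))"
        )


def shard_table(word_id):
    """Pick the posting table for a word id (assumed rule: id modulo table count)."""
    return f"postings_{word_id % NUM_SHARDS}"


def get_word_id(conn, word):
    """Return the unique id for a word, creating it on first sight."""
    conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
    row = conn.execute("SELECT id FROM words WHERE word = ?", (word,)).fetchone()
    return row[0]


def index_document(conn, doc_id, text):
    """Record each distinct word of a document in that word's posting table."""
    for word in set(re.findall(r"[a-z0-9]+", text.lower())):
        wid = get_word_id(conn, word)
        conn.execute(
            f"INSERT OR IGNORE INTO {shard_table(wid)} (word_id, doc_id) "
            "VALUES (?, ?)",
            (wid, doc_id),
        )


def search(conn, word):
    """Return the sorted ids of documents containing the word."""
    row = conn.execute(
        "SELECT id FROM words WHERE word = ?", (word.lower(),)
    ).fetchone()
    if row is None:
        return []
    wid = row[0]
    rows = conn.execute(
        f"SELECT doc_id FROM {shard_table(wid)} WHERE word_id = ? ORDER BY doc_id",
        (wid,),
    )
    return [r[0] for r in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    index_document(conn, 1, "Perl search engine")
    index_document(conn, 2, "Search the PDF pages")
    print(search(conn, "search"))
```

A lookup only ever touches the small words table plus one posting table, which is the point of the split. Multi-word queries would need one such lookup per word and an intersection of the results.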
