Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Um... this must be a moronic question, but where can I find
some sample code on how to do a search engine (e.g., to browse
through the files on the server (I have a Solaris
machine) and return a generated page of all entries found)?
Like the search button on this site, for example.

Replies are listed 'Best First'.
Re: Search button
by takshaka (Friar) on Jun 10, 2000 at 07:18 UTC
    Ugh. Stay away from Matt Wright's stuff, please.

    merlyn has a couple of old Web Techniques columns that should get you started. You'll also find lots of search scripts of varying quality at cgi.resourceindex.com.

      What is wrong with Matt Wright's stuff?

      Not that I mind or use it, but I'd like to know why you think it should not be recommended on Perlmonks. Is the code bad quality? Is it badly documented? Does it have (gasp) security holes?

        I can't say firsthand, but anecdotally it is "all of the above", most notoriously the last.
Re: Search button
by chromatic (Archbishop) on Jun 10, 2000 at 07:09 UTC
    O'Reilly's CGI Programming with Perl has a chapter on searching your web sites. The upcoming second edition is pretty good (but it won't be published for another month or so). Have a look at the previous version for some ideas.

    The search button here at Perl Monks is a little different. Since all nodes are stored in a database, the query goes to a SQL statement searching the comprehensive node index for titles similar to the query.

(jcwren) Re: Search button
by jcwren (Prior) on Jun 10, 2000 at 06:43 UTC
    This can be answered several ways. Do you need to search the site dynamically (each time the user clicks search), or can you run a cron job that periodically builds the indexes? The first method is rough on the server if you have a lot of hits. The second method doesn't work as well when you have a site with highly dynamic content.
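    As a rough illustration of the second approach, here is a minimal sketch of an inverted index in plain Perl. It is only one way to do it, under stated assumptions: the names build_index and search_index are made up for this example, and the %docs hash stands in for files that a real cron job would read from the document root (for instance with File::Find) after stripping the HTML.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build an inverted index: word => { file => occurrence count }.
# %docs (file name => text) is a stand-in for real files on disk.
sub build_index {
    my (%docs) = @_;
    my %index;
    for my $file (keys %docs) {
        # Lowercase the text so lookups are case-insensitive.
        for my $word (lc($docs{$file}) =~ /(\w+)/g) {
            $index{$word}{$file}++;
        }
    }
    return \%index;
}

# Look up a single word; return matching files, most occurrences first,
# with ties broken by file name.
sub search_index {
    my ($index, $word) = @_;
    my $hits = $index->{ lc $word } or return;
    return sort { $hits->{$b} <=> $hits->{$a} || $a cmp $b } keys %$hits;
}

my $index = build_index(
    'index.html'   => 'Welcome to the Perl pages',
    'faq.html'     => 'Perl FAQ: Perl questions and answers',
    'contact.html' => 'How to reach us',
);

print "$_\n" for search_index($index, 'perl');
```

    The index is built once, so each search is just a hash lookup. In practice the cron job would also have to save the index to disk (Storable is one option) so the CGI script can load it instead of rebuilding it.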

    There's also the question of whether you want to search on ANY text in the web page or only META tag information, and whether to support boolean logic searches, fuzzy matches, sub-searches, etc.
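    To make the boolean part concrete, here is a small sketch of AND/OR matching against the visible text of a page. The function names (matches, strip_tags) and the 'all'/'any' mode strings are invented for this example, and the tag stripper is deliberately crude; a real script would be better off with a proper parser such as HTML::Parser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return true if $text satisfies the query.
# $mode is 'all' (boolean AND) or 'any' (boolean OR).
sub matches {
    my ($text, $mode, @terms) = @_;
    # Count how many terms appear as whole words, ignoring case.
    # \Q...\E keeps regex metacharacters in user input harmless.
    my $found = grep { $text =~ /\b\Q$_\E\b/i } @terms;
    return $mode eq 'all' ? $found == @terms : $found > 0;
}

# Crude tag stripper: replaces anything that looks like a tag with a space.
sub strip_tags {
    my ($html) = @_;
    $html =~ s/<[^>]*>/ /g;
    return $html;
}

my $page = strip_tags('<h1>Perl search</h1><p>A tiny CGI example.</p>');

print matches($page, 'all', 'perl', 'cgi')    ? "match\n" : "no match\n";
print matches($page, 'all', 'perl', 'python') ? "match\n" : "no match\n";
print matches($page, 'any', 'perl', 'python') ? "match\n" : "no match\n";
```

    Searching only META information would mean pulling out the keywords or description content before calling matches, rather than stripping tags from the whole page.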

    I'm sure there are better sources, but Matt's Script Archive has an OK basic search utility.

    I've seen this question a lot, and the answer is very dependent on your needs. There are, by the way, at least two dozen free search scripts out there. I don't have URLs for them, but a search in Alta Vista should turn them up for you.

    --Chris