JayBee has asked for the wisdom of the Perl Monks concerning the following question:

I'm looking for some options for searching my site from my own server.
I've considered using file globbing along with open(file) and reading until local $/ = $matches_query; but there must be something else, right? Could I get a few examples, please?

Thank you in advance.

Replies are listed 'Best First'.
Re: Searching within my site
by Your Mother (Archbishop) on Oct 23, 2004 at 04:15 UTC
Re: Searching within my site
by tachyon (Chancellor) on Oct 23, 2004 at 08:53 UTC

    If you mean searching the server file system, locate, grep -R, and find are your command-line friends (man widget for details). If you mean a search engine for your website, I suggest swish-e, which has a C engine and a nice Perl API (as well as a pre-written Perl search CGI). It is used on sites like Apache.org, so you could do a lot worse. There is also the GNU htdig engine to consider.
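
    For the file-system case, here is a rough Perl sketch of what grep -R does, in case you want the results inside a script rather than on the command line. It is only an illustration: the search_files name, the restriction to .html/.htm/.txt files, and the case-insensitive literal match are all my assumptions, not something any of the tools above prescribe.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: return [file, line number, line text] for every
# line under $root that contains $query (case-insensitive, literal match).
sub search_files {
    my ($root, $query) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # keep $_ as the full path so open() works as-is
            wanted   => sub {
                my $path = $_;
                # Assumption: only plain .html/.htm/.txt files are worth searching.
                return unless -f $path && $path =~ /\.(?:html?|txt)$/i;
                open my $fh, '<', $path or return;    # skip unreadable files
                while ( my $line = <$fh> ) {
                    next unless $line =~ /\Q$query\E/i;
                    chomp $line;
                    push @hits, [ $path, $., $line ];
                }
                close $fh;
            },
        },
        $root
    );
    return @hits;
}

# Usage: perl sitesearch.pl /path/to/docroot "search words"
if ( @ARGV == 2 ) {
    printf "%s:%d: %s\n", @$_ for search_files(@ARGV);
}
```

    This rereads every file on every query, so it is probably only sensible for a small site; an indexing engine such as swish-e avoids that cost.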

    cheers

    tachyon

Re: Searching within my site
by PodMaster (Abbot) on Oct 23, 2004 at 09:11 UTC
Re: Searching within my site
by dpavlin (Friar) on Oct 23, 2004 at 14:35 UTC
    For some really interesting articles on how searching your own site might differ from searching the internet, see the ACM Queue issue on Enterprise Search. It will prompt you to ask questions like:
    • is there hierarchical (taxonomy) data about the site?
    • is there more metadata or structure in your site than an out-of-the-box solution supports?
    • do I want to write my own search engine?
    My site isn't enterprise-scale, but getting the wider picture is always beneficial.

    I could also recommend the Xapian search engine, which comes with the Omega indexer and CGI search programs (written in C++); there is also Search::Xapian if you want to use it from Perl.


    2share!2flame...
Re: Searching within my site
by tilly (Archbishop) on Oct 24, 2004 at 02:05 UTC
    And then there is the tried and true, "Ask Google to index it and then use Google's search." It always amazes me how many sites put energy into implementing searching software that would be better implemented that way.
      Precisely. Google visits my newly minted pages within two days, and that's good enough for me. So I have the following at the bottom of my pages:
      <form action="http://www.google.com/search" method=GET>
        <INPUT TYPE=hidden name=site value=swr>
        <INPUT TYPE=hidden name=q value="site:stonehenge.com">
        <INPUT TYPE=text name=as_q size=31 maxlength=256 value="">
        <INPUT TYPE=submit name=btnG VALUE="Search stonehenge.com with Google">
      </form>
      Just change stonehenge.com to yoursite.example.com and you're off and running! I got the values from looking at the current search pages, so it may break in the future, but so far, it's working fine.
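
      If you would rather generate that form than paste it into every page, something like the following might do. It is only a sketch: google_site_form is a hypothetical helper name, the field names are simply copied from the form above, and Google may change them at any time.

```perl
use strict;
use warnings;

# Hypothetical helper: return the Google search form for a given domain.
# The field names (site, q, as_q, btnG) are taken from the form above.
sub google_site_form {
    my ($domain) = @_;
    # Refuse anything that does not look like a plain host name, so that
    # nothing unexpected ends up inside the HTML attributes.
    die "bad domain: $domain\n" unless $domain =~ /\A[A-Za-z0-9.-]+\z/;
    return <<"HTML";
<form action="http://www.google.com/search" method="GET">
  <input type="hidden" name="site" value="swr">
  <input type="hidden" name="q" value="site:$domain">
  <input type="text" name="as_q" size="31" maxlength="256" value="">
  <input type="submit" name="btnG" value="Search $domain with Google">
</form>
HTML
}

print google_site_form('yoursite.example.com');
```

      The hidden q field carries the site: restriction, and whatever the visitor types goes into as_q.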

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

Re: Searching within my site
by davebaker (Pilgrim) on Oct 25, 2004 at 21:20 UTC
    I have been delighted with a Perl-based commercial solution that doesn't break the bank but that is more powerful than anything I've seen in the freeware/shareware area (and I think I've checked 'em all):

    http://northernlight.com/engine.html

    (the Northern Light Enterprise Search Engine).

    I'm still configuring it, but it's running in a sort of beta state on my site at

    http://benefitslink.com/search