Re: looking for a good Perl-way for implementing website search

Re^2: looking for a good Perl-way for implementing website search by davorg (Chancellor) on May 09, 2006 at 15:32 UTC
The nms search program is a reimplementation of Matt Wright's program, and we don't call it simple search for nothing. If you're considering things like Plucene, then simple search will almost certainly not be powerful enough for you. -- <http://dave.org.uk> "The first rule of Perl club is you do not talk about Perl club." -- Chip Salzenberg
Re^2: looking for a good Perl-way for implementing website search by gellyfish (Monsignor) on May 09, 2006 at 15:33 UTC
The NMS Simple search is constrained by two requirements: it must be installable on an average shared hosting account with no additional modules or shell access, and it must work as a drop-in replacement for the MSA Search. Basically, it greps the content of the files every time it runs a search, which obviously isn't the ideal way to do it. There is a TODO item to implement a search that doesn't need to be compatible with the MSA one but, y'know, time .... /J\
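To make the "greps the files on every search" point concrete, here is a minimal sketch of that approach. It is not the nms code: the helper name `grep_search`, the restriction to `.htm`/`.html` files, and the case-insensitive literal match are all assumptions made for illustration. It uses only core modules, in the spirit of the shared-hosting constraint.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of nms): re-read every HTML file under
# $root for each query. The work done per search grows with the size of
# the whole site, which is why this approach does not scale.
sub grep_search {
    my ($root, $term) = @_;
    my @hits;
    find(
        {
            no_chdir => 1,    # keep $File::Find::name usable as a path
            wanted   => sub {
                my $path = $File::Find::name;
                return unless -f $path && $path =~ /\.html?\z/i;
                open my $fh, '<', $path or return;    # skip unreadable files
                local $/;                             # slurp the whole file
                my $text = <$fh>;
                close $fh;
                # \Q...\E treats the term as literal text, not a regex
                push @hits, $path
                    if defined $text && $text =~ /\Q$term\E/i;
            },
        },
        $root
    );
    my @sorted = sort @hits;
    return @sorted;
}

# Usage: perl grep_search.pl <document-root> <term>
if (@ARGV == 2) {
    print "$_\n" for grep_search(@ARGV);
}
```

Nothing is remembered between searches, so every query pays the full cost of reading the site again. An index built ahead of time trades that for some storage and a rebuild step.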
Re^2: looking for a good Perl-way for implementing website search by punkish (Priest) on May 09, 2006 at 15:44 UTC
My original home-grown solution was actually much better than nms simple search. I had an inverted-index table of words, a table of stop words, and a master dictionary table (essentially SQL-constructed from the standard dict that comes with Unix) for avoiding duplication. As I said, it worked very well except for stemming and fuzzy searches, but it required a SQL db. Switching to nms would be going backward. I do have Plucene implemented now as an experimental mechanism, and I would like to hear how it compares to KinoSearch. Additionally, as I said in my OP, Plucene is pretty sparsely documented. It took a lot of digging around, and I still don't know, for example, how to score relevancy. Surely Plucene can't be the canonical Perl website search mechanism if it is so (IMO) sparsely documented and its last update was almost a year ago. The KinoSearch logs show some recent activity, but fwiw, Plucene is "1.14" version numbers ahead ;-). -- when small people start casting long shadows, it is time to go to bed
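For readers unfamiliar with the inverted-index-plus-stop-words design described above, here is a rough in-memory sketch. It is only an illustration under stated assumptions: Perl hashes stand in for the SQL tables, the stop-word list is a tiny made-up sample, the master dictionary table is left out, and the names `tokenize`, `build_index` and `search` are hypothetical. The ranking is a naive occurrence count, not how Plucene or KinoSearch score relevancy.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumed sample stop-word list; a real one would be much longer.
my %STOP = map { $_ => 1 } qw(a an and the of to in is it);

# Split text into lowercase words and drop the stop words.
sub tokenize {
    my ($text) = @_;
    return grep { !$STOP{$_} } map { lc($_) } $text =~ /(\w+)/g;
}

# Build the inverted index: word => { doc_id => occurrence count }.
sub build_index {
    my (%docs) = @_;
    my %index;
    for my $id (keys %docs) {
        $index{$_}{$id}++ for tokenize($docs{$id});
    }
    return \%index;
}

# Return the ids of documents containing every query word,
# ordered by total occurrences (a naive score, assumed for illustration).
sub search {
    my ($index, $query) = @_;
    my @words = tokenize($query);
    return unless @words;
    my (%score, %matched);
    for my $word (@words) {
        my $postings = $index->{$word} or return;    # unknown word: no results
        for my $id (keys %$postings) {
            $score{$id} += $postings->{$id};
            $matched{$id}++;
        }
    }
    my @ids    = grep { $matched{$_} == @words } keys %score;
    my @ranked = sort { $score{$b} <=> $score{$a} || $a cmp $b } @ids;
    return @ranked;
}

# Example with made-up page ids and text.
my $index = build_index(
    home => 'The Perl search page',
    faq  => 'Perl search and more Perl search tips',
);
print join(', ', search($index, 'the perl search')), "\n";
```

A lookup here touches only the postings for the query words rather than every file, which is the main gain over grepping. Stemming and fuzzy matching, the two gaps mentioned above, would both have to be handled in `tokenize` or by a dedicated search library.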