My original home-grown solution was actually much better than the nms Simple Search. It had an inverted-index table of words, a table of stop words, and a master dictionary table (essentially constructed in SQL from the standard dict that comes with Unix) for avoiding duplication. As I said, it worked very well except for stemming and fuzzy searches, but it required a SQL db. Moving to nms would be going backward.
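For anyone who wants to picture that three-table layout, here is a minimal sketch, not my actual code. It is written in Python with an in-memory SQLite db purely for illustration; the table names, column names, and the tiny stop-word list are all hypothetical, and the dictionary here is filled as words are seen rather than pre-loaded from the Unix dict file.

```python
import re
import sqlite3


def make_index(stop_words=("the", "a", "an", "and", "of", "to", "is")):
    """Create the three tables: stop words, dictionary, inverted index."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE stop_words (word TEXT PRIMARY KEY);
        -- One row per distinct word, so the index stores ids, not strings.
        CREATE TABLE dictionary (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
        CREATE TABLE inverted_index (
            word_id INTEGER NOT NULL REFERENCES dictionary(id),
            doc_id  INTEGER NOT NULL,
            PRIMARY KEY (word_id, doc_id)
        );
    """)
    db.executemany("INSERT INTO stop_words VALUES (?)",
                   [(w,) for w in stop_words])
    return db


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def is_stop_word(db, word):
    row = db.execute("SELECT 1 FROM stop_words WHERE word = ?", (word,))
    return row.fetchone() is not None


def index_document(db, doc_id, text):
    """Record which non-stop words occur in the document."""
    for word in set(tokenize(text)):
        if is_stop_word(db, word):
            continue
        db.execute("INSERT OR IGNORE INTO dictionary (word) VALUES (?)",
                   (word,))
        (word_id,) = db.execute(
            "SELECT id FROM dictionary WHERE word = ?", (word,)).fetchone()
        db.execute("INSERT OR IGNORE INTO inverted_index VALUES (?, ?)",
                   (word_id, doc_id))


def search(db, query):
    """Return ids of documents containing every non-stop query word."""
    terms = [w for w in set(tokenize(query)) if not is_stop_word(db, w)]
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    rows = db.execute(
        f"""SELECT i.doc_id
              FROM inverted_index i
              JOIN dictionary d ON d.id = i.word_id
             WHERE d.word IN ({placeholders})
             GROUP BY i.doc_id
            HAVING COUNT(DISTINCT d.word) = ?
             ORDER BY i.doc_id""",
        (*terms, len(terms)))
    return [doc_id for (doc_id,) in rows]
```

Note that the lookup is an exact match on whole words, which is exactly why this kind of design has no stemming or fuzzy matching: a query for "foxes" will not find a page that only contains "fox".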
I do have Plucene implemented now as an experimental mechanism, and I would like to hear how it compares to KinoSearch. Additionally, as I said in my OP, Plucene is pretty sparsely documented. It took a lot of digging around, and I still don't know, for example, how to score relevancy. Surely Plucene can't be the canonical Perl website search mechanism if it is so (IMO) sparsely documented and its last update was almost a year ago. The KinoSearch logs show some recent activity, but fwiw, Plucene is "1.14" version numbers ahead ;-).
--
when small people start casting long shadows, it is time to go to bed