Well, this has been an interesting day. Thank you for taking the time to discuss this with me, both in your posts and in the chatterbox.

Yes, I'm aware of merlyn's articles. The first one didn't seem completely appropriate because it doesn't use indexes (apologies for daring to criticize), and the second presumes that your site is interesting to the search engines (and therefore visited). Mine isn't. Yes, I've submitted it; there's a robots.txt file, and there are carefully chosen META tags and keywords. So far, the search engines have come, but they haven't listed me. So that rules out the second approach.

I'm not trying to reinvent the wheel. I'm trying to find a right-sized wheel that fits my needs.

I looked at the suggested packages. One wants more money than I can afford (yes, there's a crippled version for free, but its licensing seems a bit screwy).

Another is open source, but it's written in C. I'd prefer to find a Perl solution if possible, so I can learn from it.

The links provided by lemming are promising. I'll try to work something out of those.

baku's sample is interesting, but is taking my album example a little too seriously. :)

In reality, I'm looking for something to index a large number of free-form text documents, plus a companion program to search those indexes, preferably something written in proper style: for example, something that uses warnings, strict, and taint mode.

I'd really appreciate it if this companion also provided support for soundex, word proximity, and root words, e.g. knowing that "search" should hit "searching," "searches," and so on.
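
To make that concrete, here is a rough sketch of the behaviour I have in mind. It is not a proposed solution. The sub names (root_word, build_index, search, near) and the sample documents are made up for illustration, and the suffix stripping is a crude stand-in for a real stemmer. Soundex is left out, and so is taint mode, since the sketch reads no outside input.

```perl
use strict;
use warnings;

# Crude root-word reduction: strips a few common English suffixes.
# ASSUMPTION: illustration only; a real stemmer would do much better.
sub root_word {
    my ($word) = @_;
    $word = lc $word;
    $word =~ s/(?:ing|es|ed|s)$// if length $word > 4;
    return $word;
}

# Build an inverted index: root word => { document name => [word positions] }.
sub build_index {
    my (%docs) = @_;
    my %index;
    for my $name (sort keys %docs) {
        my $pos = 0;
        for my $word ($docs{$name} =~ /(\w+)/g) {
            push @{ $index{ root_word($word) }{$name} }, $pos++;
        }
    }
    return \%index;
}

# Names of the documents that contain the term, matched by root word.
sub search {
    my ($index, $term) = @_;
    return sort keys %{ $index->{ root_word($term) } || {} };
}

# Word proximity: documents where both terms occur within $max words of each other.
sub near {
    my ($index, $term_a, $term_b, $max) = @_;
    my $hits_a = $index->{ root_word($term_a) } || {};
    my $hits_b = $index->{ root_word($term_b) } || {};
    my @found;
    for my $name (sort keys %$hits_a) {
        next unless $hits_b->{$name};
        # Count positions of term A that have a position of term B close by.
        my $close = grep {
            my $pos_a = $_;
            grep { abs($pos_a - $_) <= $max } @{ $hits_b->{$name} }
        } @{ $hits_a->{$name} };
        push @found, $name if $close;
    }
    return @found;
}

# Hypothetical sample documents, held in memory rather than read from disk.
my $index = build_index(
    'a.txt' => 'Searching the archives for old recordings',
    'b.txt' => 'She searches every index by hand',
    'c.txt' => 'Nothing relevant here',
);

print join(', ', search($index, 'search')), "\n";          # a.txt, b.txt
print join(', ', near($index, 'search', 'index', 3)), "\n"; # b.txt
```

What I'm hoping to find is something already written and respected that does this properly, with the index stored on disk and built by a separate program.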

And, most important, I'm looking for something that you folks respect. I really don't want to have to try to rewrite stuff from Matt's Script Archive. Not only am I not that experienced, but I'm not sure I'd know where to start (other than the bits I already mentioned).

Update: I just realized that you might think I'm asking you folks to write this. I'm not, but I am asking if such a thing has already been written.

Again, thanks for your assistance.


In reply to Re: Searching module by ZydecoSue
in thread Searching module by ZydecoSue
