Dear Monks,

I know there are many cool indexers, search engines, etc. out there, but I don't know whether any of them is a good fit for my system. My system is nothing unusual: it is a typical hybrid that stores text, HTML, XML, etc. content in the filesystem and metadata about those files (which group they belong to, which template script is used to render them, etc.) in a database.

For example, assume I have this kind of file hierarchy:

/10/1/1345.html
/10/1/1346.html
/11/7/6544.html

1345, 1346 and 6544 are the ID numbers of the content. Assume I have DB records like these:

ID    Type  Course  Week
1345  1     10      1
6544  5     11      7
1346  5     10      1

So users never see 1345.html or 1346.html; depending on the content type, the URL they see looks something like this:

http://www.blabla.com/ContentPage?ID=1345
http://www.blabla.com/DiscussionPage?ID=1346
etc.

This means that the indexer and search system must take that into account: it is not a simple 'WORD -> THIS_FILE' structure, but one that needs further transformations according to rules that I'll provide to the system.
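
To make the requirement concrete, here is a minimal Perl sketch of the kind of mapping I mean. It is only an illustration under my own assumptions: the names %meta, %page_for_type, file_path, public_url, index_text and search are made up for this post, the type-to-page rule is inferred from the two example URLs above, and the in-memory hash stands in for whatever real index an engine would keep.

```perl
use strict;
use warnings;

# Metadata rows as they might come back from the database
# (values taken from the example records above).
my %meta = (
    1345 => { type => 1, course => 10, week => 1 },
    6544 => { type => 5, course => 11, week => 7 },
    1346 => { type => 5, course => 10, week => 1 },
);

# Assumed rule table: content type => public page name.
# Types 1 and 5 follow the example URLs; other types would be added here.
my %page_for_type = (
    1 => 'ContentPage',
    5 => 'DiscussionPage',
);

# Where the file for a given ID lives on disk: /Course/Week/ID.html
sub file_path {
    my ($id) = @_;
    my $m = $meta{$id} or return;
    return sprintf '/%d/%d/%d.html', $m->{course}, $m->{week}, $id;
}

# The URL a user should be sent to for a given ID.
sub public_url {
    my ($id) = @_;
    my $m    = $meta{$id} or return;
    my $page = $page_for_type{ $m->{type} } or return;
    return "http://www.blabla.com/$page?ID=$id";
}

# Toy inverted index: lowercased word => { content ID => occurrence count }.
my %index;

sub index_text {
    my ($id, $text) = @_;
    $text =~ s/<[^>]*>/ /g;    # crude tag stripping, good enough for a sketch
    $index{ lc $1 }{$id}++ while $text =~ /(\w+)/g;
    return;
}

# Look a word up and return public URLs, most frequent hit first
# (ties broken by ascending ID).
sub search {
    my ($word) = @_;
    my $hits = $index{ lc $word } or return;
    return map  { public_url($_) }
           sort { $hits->{$b} <=> $hits->{$a} || $a <=> $b }
           keys %$hits;
}

# Usage: in a real run the text would be read from file_path($id).
index_text(1345, '<p>Perl regex tutorial</p>');
index_text(1346, '<p>Discussion about regex, regex and more regex</p>');
print "$_\n" for search('regex');
# prints:
# http://www.blabla.com/DiscussionPage?ID=1346
# http://www.blabla.com/ContentPage?ID=1345
```

The point is that the index is keyed by content ID rather than by file name, so the file location and the user-visible URL are both derived from the DB metadata only when they are needed.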

Since Perl is the one to rule text processing, manipulation, etc., I'd be glad to hear from fellow monks who have encountered a similar situation and found a good solution. Please enlighten me with your experience and vision...

In reply to Perfect Indexer & Search Engine by YAFZ
