I don't know whether this is relevant, but you might look at KinoSearch -- at least for ideas. Basically, you need some sort of process that will read through your set of documents and build an index to identify all the locations of all possible keywords. Then you need a separate query process that knows how to read the index data, and how to use the information provided there to locate the specific documents that meet specific conditions on particular keywords.
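
To make the two-process idea concrete, here is a minimal sketch of an inverted index plus a query function. It is written in Python for brevity and is not how KinoSearch itself works; the names `tokenize`, `build_index` and `search`, and the sample documents, are all made up for illustration. A real engine would also persist the index to disk and handle stemming, phrases and ranking.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric keywords."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Indexing pass: map each keyword to {doc_id: [word positions]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word].setdefault(doc_id, []).append(pos)
    return index

def search(index, required=(), excluded=()):
    """Query pass: ids of documents that contain every required keyword
    and none of the excluded ones."""
    required = [w.lower() for w in required]
    if not required:
        return set()
    hits = set(index.get(required[0], {}))
    for word in required[1:]:
        hits &= set(index.get(word, {}))
    for word in excluded:
        hits -= set(index.get(word.lower(), {}))
    return hits

# Hypothetical sample documents, keyed by file name.
docs = {
    "a.txt": "Perl and C make a fast search engine",
    "b.txt": "A database can also search text",
    "c.txt": "Perl parses a query",
}
index = build_index(docs)
```

With that in place, `search(index, ["perl"], excluded=["search"])` only consults the index and never rereads the documents, which is the point of separating the indexing step from the query step.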

A database solution would probably work okay, but people have built "search engine" apps that are better optimized for this kind of task. KinoSearch (which I personally have not used) is one such engine, built with Perl and C.


In reply to Re: how to parse a query by graff
in thread how to parse a query by arunmep
