In a nutshell, you are trying to find the best solution to several different problems:
  1. How should the site be indexed? (Should the index be as compact as possible? Or should there be several indices, one per portion of the site, for faster access?)
  2. How do you take the user's query and return the proper results from the index? (Do you offer phrase searching or just keyword searching? Is relevancy determined by keyword frequency or by something else? Are the files all HTML?)
  3. How do you return the results to the user? (Display the page title and URL? What additional information needs to be displayed, such as an excerpt from the page?)
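
To make the first two questions concrete, here is a minimal sketch of one possible answer: a single in-memory inverted index, keyword-only searching, and relevancy by keyword frequency. It is written in Python for brevity. The function names (`tokenize`, `build_index`, `search`) and the sample pages are made up for illustration, and the page text is assumed to be already stripped of HTML.

```python
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(pages):
    """Map each word to a Counter of {url: occurrences} (an inverted index)."""
    index = defaultdict(Counter)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word][url] += 1
    return index

def search(index, query):
    """Return URLs containing every query word, highest total frequency first."""
    words = tokenize(query)
    if not words:
        return []
    # Missing words get an empty Counter so the index itself is never modified.
    postings = [index.get(w, Counter()) for w in words]
    matches = set(postings[0])
    for p in postings[1:]:
        matches &= set(p)
    scored = [(sum(p[url] for p in postings), url) for url in matches]
    # Sort by descending score, then by URL so ties come out in a stable order.
    return [url for score, url in sorted(scored, key=lambda s: (-s[0], s[1]))]

# Hypothetical pages, assumed to be plain text already.
pages = {
    "/perl/intro.html": "Perl is a language. Perl makes text processing easy.",
    "/perl/regex.html": "Regular expressions in Perl match text patterns.",
    "/about.html": "About this site and its authors.",
}
index = build_index(pages)
print(search(index, "perl text"))
```

Phrase searching would need word positions stored in the index as well, which is one reason the "how compact?" question in #1 depends on what you decide for #2.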

Perhaps #3 is the easiest part, provided you have generated the right index, but reading the tutorials and coming up with a more concrete definition of the problem you're trying to solve should help.
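
As a rough illustration of #3, the sketch below renders one hit as title, URL, and a short excerpt around the first occurrence of the search term. It is in Python, the helper names (`make_excerpt`, `format_result`) and the `radius` parameter are hypothetical, and it assumes you kept the page title and plain text alongside the index.

```python
import re

def make_excerpt(text, term, radius=30):
    """Return a snippet of text around the first case-insensitive hit for term."""
    hit = re.search(re.escape(term), text, re.IGNORECASE)
    if hit is None:
        # No match in the body: fall back to the start of the page.
        snippet = text[: 2 * radius]
        return snippet + ("..." if len(text) > len(snippet) else "")
    start = max(0, hit.start() - radius)
    end = min(len(text), hit.end() + radius)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix

def format_result(title, url, text, term):
    """Render one hit as title, URL, and excerpt, one per line."""
    return f"{title}\n{url}\n{make_excerpt(text, term)}"
```

If the index does not keep the page text, the excerpt has to be built by re-reading the file at display time, so this is another place where the display requirements feed back into the index design.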

Also, if Google has indexed your site in its entirety and crawls it frequently, you don't have to settle for their free Web search form. You can use the Google API for full-blown searches (although that would limit you to 1,000 searches per day).


In reply to Re: Building a search engine by Anonymous Monk
in thread Building a search engine by artist
