There is a good reaason why search engines don't allow full regex searches--they are simply too slow.

Yep. The way we typically implement a regex query against an inverted index is...

  1. Scan the over the whole term dictionary looking for terms that match the regex.
  2. Iterate over the posting lists (enumeration of doc ids that match) for all matching terms.

If too many terms match, that could end up being slower than a full table scan. Depending on implementation and index size, you could also end up running out of memory (e.g. if the posting lists are all iterated concurrently). Futhermore, that algo limits the scope of regex matches to individual terms.

Getting good performance out of indexed data is all about planning what queries you need in advance. Regexes are so flexible that they're hard to plan for.


In reply to Re^2: What DB style to use with search engine by creamygoodness
in thread What DB style to use with search engine by halfcountplus

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.