I would start here: Building a Vector Space Search Engine in Perl. Then maybe look into Lucy, Search::Elasticsearch, and Search::Tools, and the many tangents you will encounter at each junction. This is a fun but deceptively deep problem space: stemming, tokenizing, substrings, case, encoding, the actual definition of what a word/token is, and the fact that almost nothing is plain text to start with any more but some kind of markup or document format… A vanilla inverted index or vector search is not that hard to write; making it work with speed and reasonable scoring is incredibly difficult.
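To give a feel for how small the "vanilla" version is, here is a rough sketch of an inverted index with raw term-count scoring. The sample documents, the `tokenize` and `search` subs, and the scoring rule are all made up for illustration; it does no stemming, weighting, or length normalization, which is exactly where the hard part begins.

```perl
use strict;
use warnings;

# Hypothetical sample documents; ids and text are invented for this sketch.
my %docs = (
    a => "The quick brown fox",
    b => "The lazy dog and the quick cat",
    c => "Brown dogs are lazy",
);

# Naive tokenizer: lowercase, then split on runs of non-word characters.
# No stemming, so "dog" and "dogs" are treated as different terms.
sub tokenize {
    my ($text) = @_;
    return grep { length } split /\W+/, lc $text;
}

# Inverted index: term => { doc_id => number of occurrences in that doc }
my %index;
for my $id (keys %docs) {
    $index{$_}{$id}++ for tokenize($docs{$id});
}

# Score each document by summing the counts of the query terms it contains.
# Returns [doc_id, score] pairs, best score first, ties broken by doc id.
sub search {
    my ($query) = @_;
    my %score;
    for my $term (tokenize($query)) {
        my $postings = $index{$term} or next;    # term not indexed
        $score{$_} += $postings->{$_} for keys %$postings;
    }
    return map  { [ $_, $score{$_} ] }
           sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
}

printf "%s\t%d\n", @$_ for search("quick lazy dog");
```

Note that document `c` only matches on "lazy" here because "dogs" never matches "dog", which is one of the tangents (stemming) mentioned above.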
Have fun. :P
In reply to Re: Hash Search Ranking
by Your Mother
in thread Hash Search Ranking
by hoyt