in reply to Offsite Perlmonks Search Engine

Is it possible to include some sort of stemming hash to match conjugated forms of a word, or classes of similar words?

For example, searching for "parse" currently does not match "parsing".

I know it's extra overhead, but perhaps cached hash results from something like Lingua::Stem or Lingua::EN::Infinitive?
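
To make the cached-hash idea concrete, here is a minimal sketch, not a proposal for the actual implementation. It assumes the index can be keyed on stems rather than raw words. toy_stem() is a hypothetical stand-in for a real stemmer such as Lingua::Stem (it only strips a few common suffixes), and the sample nodes and helper names are invented for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for a real stemmer: strips a few common English
# suffixes. It is not the Porter algorithm and will mis-stem many words.
sub toy_stem {
    my ($word) = @_;
    $word = lc $word;
    $word =~ s/(?<=\w{3})(?:ing|ed|es|s)$//;   # parsing, parsed, parses -> pars
    $word =~ s/(?<=\w{3})e$//;                 # parse -> pars
    return $word;
}

# Cache of word => stem, so each distinct word is stemmed only once.
my %stem_cache;

sub cached_stem {
    my ($word) = @_;
    my $key = lc $word;
    $stem_cache{$key} = toy_stem($key) unless exists $stem_cache{$key};
    return $stem_cache{$key};
}

# Invented sample data: node id => node text.
my %nodes = (
    101 => 'Parsing HTML with regexes',
    102 => 'How do I parse a config file?',
    103 => 'Sorting a hash by value',
);

# Inverted index keyed on stems: stem => { node id => 1 }.
my %index;
for my $id (keys %nodes) {
    for my $word ($nodes{$id} =~ /(\w+)/g) {
        $index{ cached_stem($word) }{$id} = 1;
    }
}

# Look up a query term by its stem; returns matching node ids, sorted.
sub search {
    my ($term) = @_;
    my $hits = $index{ cached_stem($term) } || {};
    return sort { $a <=> $b } keys %$hits;
}

print join(', ', search('parse')), "\n";   # prints "101, 102"
```

The stemming cost is paid once per distinct word at index time, plus once per query term, so the per-search overhead should stay small as long as the cache is kept around (or written to disk) between runs.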

rob_au sparked an interesting node on stemming earlier this year (Natural Language Index Stemming) that might be helpful as well.

Thanks for the effort!

Matt

Re: Re: Offsite Perlmonks Search Engine
by Elian (Parson) on Jul 08, 2002 at 01:20 UTC
    Stemming is rather processor-intensive, and a pain to get right in general, but it could work on PerlMonks, which has the advantage of a reasonably small data set and content almost exclusively in English. It'd be interesting to give it a shot.