http://qs1969.pair.com?node_id=91433

swiftone has asked for the wisdom of the Perl Monks concerning the following question:

I need to be able to search a block of text to see if a given question is in there, with broad flexibility for different ways to state the question.

My workplace has a problem with too many people asking FAQs by email. To try and free up staff time, here's my plan:

  1. John Doe comes to our website and clicks on the "send comments and questions" link.
  2. John Doe fills out a form with contact info and a text block for comments and questions.
  3. When "submit" is clicked, the input is checked against a list of FAQs.
    • If there are no matches, the form is emailed to a customer service rep.
    • If there is a match, the matching Q&A (or a link) is returned to the user with an appropriate blurb. The user can then either confirm that the request was not answered (which results in the form being emailed), or leave happily.
The problem is determining how best to do this. I could try comparing sentences to the FAQs using String::Approx, but that would likely produce strange matches, and it would be thrown off by how often our customers leave out punctuation.

I could go with a keyword search, but that requires adding keywords to our existing FAQ list, not to mention that keyword matching isn't such a great way to match FAQs.
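
To make the keyword idea concrete, here is a minimal sketch of what I have in mind. The FAQ entries, the keyword lists, and the `match_faqs` name are all made up for illustration, and the "at least one shared keyword" threshold is just an assumption that leans towards false matches. It uses core Perl only.

```perl
use strict;
use warnings;

# Hypothetical FAQ list: each entry carries hand-picked keywords.
my @faqs = (
    {
        question => 'How do I reset my password?',
        keywords => [qw(reset password forgot login)],
    },
    {
        question => 'What are your shipping rates?',
        keywords => [qw(shipping rates delivery postage)],
    },
);

# Return every FAQ that shares at least $min_hits keywords with the
# submitted text, highest keyword count first.
sub match_faqs {
    my ($text, $min_hits) = @_;
    $min_hits = 1 unless defined $min_hits;    # assumed default: be generous

    # Lowercase and split on anything that is not a letter, digit, or
    # apostrophe, so missing punctuation makes no difference.
    my %seen = map { $_ => 1 } grep { length } split /[^a-z0-9']+/, lc $text;

    my @scored;
    for my $faq (@faqs) {
        # grep in scalar context counts the keywords present in the text.
        my $hits = grep { $seen{$_} } @{ $faq->{keywords} };
        push @scored, [ $hits, $faq ] if $hits >= $min_hits;
    }
    return map { $_->[1] } sort { $b->[0] <=> $a->[0] } @scored;
}

my @matches = match_faqs('hi i forgot my password how can i reset it');
print "Possible FAQ: $_->{question}\n" for @matches;
print "No match, forward to customer service\n" unless @matches;
```

Raising `$min_hits` would trade false matches for missed ones; the empty-list case is where the form would be emailed to a rep as in step 3.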

In general, I'm willing to lean towards too many false matches rather than too few. Any ideas?