Another pet project of mine has recently branched into trying to determine whether two sentences are "equivalent", that is, whether they ask about or refer to the same thing despite different grammatical structures and so forth. Obviously this is a Very Hard Task and I'm not really looking to solve it perfectly. That would be awesome, but it's rather unlikely.

Anyway, my general idea at the moment is to strip the stop words, stem the remainder, and compare the resulting set with the set produced from the other sentence. This will probably be fairly fast for small cases, and it's easy to think of and implement, but it has some flaws. For example, how do I store the resulting sets so I can easily reference them again? If I've got hundreds of thousands of these sets and I want to find the ones that match a new set I just created, how do I do that?
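To make the idea concrete, here is a rough sketch of the pipeline plus one possible answer to the storage question: an inverted index that maps each stem to the ids of the stored sets containing it. Everything named here is made up for illustration (`STOP_WORDS`, `crude_stem`, `normalize`, `SentenceIndex`), the stop list is tiny, and the suffix stripper is only a stand-in for a real stemmer such as Porter's. The Jaccard score used for near matches is also my own assumption, not something the idea above requires.

```python
import re
from collections import defaultdict

# Tiny illustrative stop list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "do", "does", "i",
              "of", "to", "in", "what", "how", "my", "for"}

def crude_stem(word):
    """Very rough suffix stripper standing in for a real stemmer."""
    for suffix in ("ing", "es", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def normalize(sentence):
    """Lowercase, drop stop words, stem the rest, return a hashable set."""
    words = re.findall(r"[a-z0-9]+", sentence.lower())
    return frozenset(crude_stem(w) for w in words if w not in STOP_WORDS)

class SentenceIndex:
    """Inverted index: stem -> ids of the stored sets that contain it."""

    def __init__(self):
        self.sets = []                    # id -> frozenset of stems
        self.postings = defaultdict(set)  # stem -> set of ids

    def add(self, sentence):
        stems = normalize(sentence)
        sid = len(self.sets)
        self.sets.append(stems)
        for stem in stems:
            self.postings[stem].add(sid)
        return sid

    def exact(self, sentence):
        """Ids whose stem set equals the query's stem set."""
        stems = normalize(sentence)
        if not stems:
            return []
        # Only sets containing every query stem can be equal to it.
        candidates = set.intersection(
            *(self.postings.get(s, set()) for s in stems))
        return sorted(sid for sid in candidates if self.sets[sid] == stems)

    def similar(self, sentence, threshold=0.5):
        """(id, Jaccard score) pairs at or above the threshold, best first."""
        stems = normalize(sentence)
        overlap = defaultdict(int)
        # Touch only the stored sets that share at least one stem.
        for stem in stems:
            for sid in self.postings.get(stem, ()):
                overlap[sid] += 1
        results = []
        for sid, shared in overlap.items():
            union = len(stems) + len(self.sets[sid]) - shared
            score = shared / union
            if score >= threshold:
                results.append((sid, score))
        return sorted(results, key=lambda r: (-r[1], r[0]))
```

With this layout a lookup only visits the stored sets that share a stem with the query, rather than comparing against all of them. If only exact matches matter, a plain hash keyed on the sorted, joined stems would be simpler still.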

So that's my idea. Anyone else have any useful ideas? Pointers to research? Clever algorithms?

In reply to Theory time: Sentence equivalence by BUU
