cntrtrst has asked for the wisdom of the Perl Monks concerning the following question:
Hello monks. I'm looking for some quasi-semantic wisdom on the following problem. I'd like to compare sentences to each other to see if they're the same, but making allowances for added/missing words or typos.
This may not be a Perl question in the strictest sense, but I'll be using Perl to do it. I've considered various forms of diff, including WordDiff, which is nice but not quite what I'm after. The algorithm I'm considering now does spot checks of substrings at random indices, but that raises hard-to-answer questions about what constitutes an acceptable margin of error, and I'm not sure it will work very well in the wild.
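To make the spot-check idea concrete, here is a minimal sketch of one way it could look. Everything in it is an assumption for illustration: the helper name `spot_check_score`, the probe length of 5 characters, the 50 samples, and the choice to test whether each probe appears anywhere in the candidate rather than at the same index (so that an added or missing word does not shift every later probe out of place).

```perl
use strict;
use warnings;

# Hypothetical helper: returns the fraction of randomly chosen template
# substrings that also occur somewhere in the candidate text.
sub spot_check_score {
    my ($template, $candidate, %opt) = @_;
    my $len     = $opt{len}     // 5;     # probe length (assumed default)
    my $samples = $opt{samples} // 50;    # number of spot checks (assumed default)

    # Normalise case and whitespace so trivial differences do not count.
    for ($template, $candidate) {
        $_ = lc $_;
        s/\s+/ /g;
        s/^ | $//g;
    }

    # A template shorter than one probe cannot be sampled; score it 0.
    my $max_start = length($template) - $len;
    return 0 if $max_start < 0;

    my $hits = 0;
    for (1 .. $samples) {
        my $start = int rand($max_start + 1);          # random index into template
        my $probe = substr $template, $start, $len;    # substring to look for
        $hits++ if index($candidate, $probe) >= 0;     # found anywhere in candidate?
    }
    return $hits / $samples;
}

my $template = 'The quick brown fox jumps over the lazy dog.';
printf "%.2f\n", spot_check_score($template, 'The quick brwn fox jumps over the lazy dog');
printf "%.2f\n", spot_check_score($template, 'I would rather write about something else.');
```

Because the probes are random, the scores vary from run to run, and the cut-off between "copying" and "deviating" would still have to be picked by experiment, which is exactly the margin-of-error question above.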
The purpose is to take incoming text streams and compare them to a template to determine whether the person is using the template or deviating from it. In this application, people will be allowed and even encouraged to deviate from the template they're given to write, but I want to be able to detect in real time when that's happening.
One thing that should make the problem easier is that users should be either attempting to copy the template or clearly doing something else. The two behaviors should be quite clearly distinct and, to the eye, would be easily distinguishable. However, a human reader can judge the meaning of the sentence being evaluated and I think that's actually the first line of analysis that informs the rest (such as noticing typos).
Any general thoughts on algorithms to approach this problem with will be appreciated. Thank you very much.
Replies are listed 'Best First'.
Re: comparing sentences
by Your Mother (Archbishop) on Oct 29, 2016 at 19:30 UTC
Re: comparing sentences
by Albannach (Monsignor) on Oct 29, 2016 at 19:22 UTC
Re: comparing sentences
by BrowserUk (Patriarch) on Oct 29, 2016 at 19:35 UTC
Re: comparing sentences
by AnomalousMonk (Archbishop) on Oct 29, 2016 at 19:47 UTC
Re: comparing sentences
by tybalt89 (Monsignor) on Oct 30, 2016 at 16:49 UTC
Re: comparing sentences
by kcott (Archbishop) on Oct 30, 2016 at 14:10 UTC
Re: comparing sentences
by cntrtrst (Initiate) on Oct 30, 2016 at 05:41 UTC