How about parsing the quote into grammatical pieces? You could then compare the parse trees of the quote and the source for similarity, in addition to looking at which text has changed.
If the sentence can't be parsed at all, that's a good hint that it is spam. Otherwise, having a tree handy would make it easy to provide hints and highlights for the human editor to look at, saving them time.
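To make the idea a bit more concrete, here is a rough sketch in Python. It assumes you already have parse trees from some parser of your choice; the nested-tuple tree format and the function names (`productions`, `tree_similarity`, `changed_words`) are made up for illustration and not taken from any particular library. Similarity is scored as the overlap of grammar productions, which is just one possible measure, and `difflib` from the standard library picks out the changed words for highlighting.

```python
import difflib
from collections import Counter

# Assumed tree format (hypothetical): (label, child, child, ...),
# where each child is either another tuple or a plain word string.

def productions(tree):
    """Yield one (parent_label, child_labels) pair per internal node."""
    if isinstance(tree, str):
        return  # a bare word has no production of its own
    label, *children = tree
    yield (label, tuple(c if isinstance(c, str) else c[0] for c in children))
    for child in children:
        yield from productions(child)

def tree_similarity(a, b):
    """Multiset Jaccard overlap of productions, between 0 and 1."""
    pa, pb = Counter(productions(a)), Counter(productions(b))
    shared = sum((pa & pb).values())
    total = sum((pa | pb).values())
    return shared / total if total else 1.0

def changed_words(quote, source):
    """Return (op, quote_words, source_words) for each span that differs."""
    q, s = quote.split(), source.split()
    matcher = difflib.SequenceMatcher(a=q, b=s)
    return [(op, q[i1:i2], s[j1:j2])
            for op, i1, i2, j1, j2 in matcher.get_opcodes()
            if op != "equal"]

# Hand-built example trees standing in for real parser output.
source_tree = ("S", ("NP", ("DT", "the"), ("NN", "dog")), ("VP", ("VBD", "barked")))
quote_tree = ("S", ("NP", ("DT", "the"), ("NN", "cat")), ("VP", ("VBD", "barked")))

print(tree_similarity(quote_tree, source_tree))
print(changed_words("the cat barked", "the dog barked"))
```

A high tree score combined with a short list of changed words suggests a lightly edited quote. The changed spans are the pieces you would highlight for the human editor. Where to set the threshold is something you would have to tune against real submissions.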
In reply to Re: Verifying a quote matches (closely enough) a source URI
by SuicideJunkie
in thread Verifying a quote matches (closely enough) a source URI
by Your Mother