in reply to Re: Duplicate detection (SQL)
in thread Duplicate (similarity) detection (SQL)

I agree. Calling this node "duplicate detection" was a bit of a misnomer ("duplicate" is the term our system uses for records that are similar enough to be marked as duplicates), so I renamed it to "similarities detection."

I'm not sure this is feasible at all: running a Search::Similarities comparison against the target query for every entry in the db (a large database could have thousands) would likely take far too long. I may try it in a quick spike solution to see whether the run time is reasonable, but does anyone have any better ideas?
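
For anyone who wants to try the same experiment, a spike along these lines might be enough to get a rough timing. This is only a sketch under loud assumptions: it is written in Python rather than against our real code, `difflib.SequenceMatcher` stands in for Search::Similarities (whose actual interface is not shown here), the records are random synthetic strings rather than real rows, and the names `similarity`, `find_similar`, `make_records` and the 0.8 threshold are made up for illustration.

```python
import difflib
import random
import string
import time


def similarity(a, b):
    """Return a 0..1 similarity ratio.

    Assumption: difflib is only a stand-in for the real comparison routine.
    """
    return difflib.SequenceMatcher(None, a, b).ratio()


def find_similar(target, records, threshold=0.8):
    """Brute force: compare the target against every (id, text) record."""
    matches = []
    for record_id, text in records:
        score = similarity(target, text)
        if score >= threshold:
            matches.append((record_id, score))
    # Highest-scoring matches first.
    return sorted(matches, key=lambda m: m[1], reverse=True)


def make_records(count, length=40, seed=1):
    """Generate synthetic (id, text) rows in place of a real table."""
    rng = random.Random(seed)
    alphabet = string.ascii_lowercase + " "
    return [
        (i, "".join(rng.choice(alphabet) for _ in range(length)))
        for i in range(count)
    ]


if __name__ == "__main__":
    records = make_records(5000)
    target = records[42][1]  # reuse an existing row so one match is guaranteed
    start = time.perf_counter()
    matches = find_similar(target, records)
    elapsed = time.perf_counter() - start
    print(f"{len(records)} comparisons in {elapsed:.2f}s, {len(matches)} match(es)")
```

The timing will depend heavily on record length and on how expensive the real comparison is, so the number this prints says little about the real system. It should at least show how the cost grows as the record count goes up.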