in reply to Finding Nearly Identical Sets
If I understand your requirements correctly, you can use your approach in a pragmatic way, because any two "neighboring" multisets must have at least 8 digits in common.
So index each stored set under every 8-digit subset you get by dropping one of its digits.
At the end you'll only need 9 hash lookups to drastically narrow down the potential candidates.
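A minimal sketch of that idea, not a tuned implementation. It assumes every set is a multiset of exactly 9 digits passed as a string, and that "neighboring" means two sets differ in exactly one digit. The names canon, subkeys, add_set, candidates and %index are made up for this illustration.

```perl
use strict;
use warnings;

# Assumption: a set is a string of 9 digits; order is irrelevant,
# so the sorted string serves as the canonical form.
sub canon {
    my ($s) = @_;
    return join '', sort { $a cmp $b } split //, $s;
}

# All distinct 8-digit keys obtained by deleting one digit.
sub subkeys {
    my $c = canon(shift);
    my %seen;
    return grep { !$seen{$_}++ }
           map  { my $s = $c; substr($s, $_, 1, ''); $s }
           0 .. length($c) - 1;
}

# HoH: 8-digit key => { canonical stored set => 1 }
my %index;

sub add_set {
    my $c = canon(shift);
    $index{$_}{$c} = 1 for subkeys($c);
}

# At most 9 hash lookups return every stored set that shares
# 8 digits with the query (including the query itself, if stored).
sub candidates {
    my %found;
    for my $k (subkeys(shift)) {
        $found{$_} = 1 for keys %{ $index{$k} || {} };
    }
    return sort keys %found;
}

add_set($_) for qw(123456789 123456788 987654321 111111111);
print join(',', candidates('123456780')), "\n";   # 123456788,123456789
```

The candidates still have to be checked against whatever distance you really care about. The index only narrows the search.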
NB: That's a pragmatic approach, a detailed survey might show more efficient algorithms.
HTH :)
PS: this problem reminds me of the Hamming distance of error-correcting codes, but I doubt you can easily apply this here.
Cheers Rolf
(addicted to the Perl Programming Language and ☆☆☆☆ :)
Je suis Charlie!
I just realized that you already sketched that approach in Re^2: Finding Nearly Identical Sets. Not sure why you say it's ugly, because a HoH (hash of hashes) should be quite fast, and you'd need to check anyway whether your input is equidistant to multiple neighbors.
Re^2: Finding Nearly Identical Sets
by Limbic~Region (Chancellor) on Sep 29, 2016 at 11:54 UTC
by LanX (Saint) on Sep 29, 2016 at 20:53 UTC