in reply to Help thinking about an alternate algorithm
Where each 'letter' of the string is a category and thus each letter has a different range of possibilities, and the number of categories is the length of the string.
I need to determine the furthest distance two selections are from a list of possible selections.
That bit -- two selections from a list of possible selection -- is muddy. But if we ignore that and only consider this bit:
In other words, comparing every selection in the list against every other selection in the list.
What you are describing is a full cross-comparison using a variation on the 'edit distance'.
The bottom line for which I think is that because you are seeking a relative measure, there is no way to avoid a full cross comparison.
This is because the difference between the first letters is equally significant to those between the last; thus any attempt to sort the strings; or to reduce the strings to numerical values so that they may be ordered with a view to reducing the number of comparisons; fails.
Maybe if you encode each category as a set of bits, and combine those bits into bit strings, the individual comparisons reduces to a matter of xoring the pairs of strings and counting the number of bits in the result.
That should prove to be substantially faster than 5 to 10 individual subtractions and summing of the results.
But I think the best you can hope for is to improve the per-pair comparison cost; rather than there being any way to reduce the number of per-pair comparisons that must be done. (I think :)
|
|---|
| Replies are listed 'Best First'. | |
|---|---|
|
Re^2: Help thinking about an alternate algorithm
by Limbic~Region (Chancellor) on Jan 15, 2014 at 16:55 UTC | |
by BrowserUk (Patriarch) on Jan 15, 2014 at 17:02 UTC |