Where each 'letter' of the string is a category and thus each letter has a different range of possibilities, and the number of categories is the length of the string.
I need to determine the furthest distance two selections are from a list of possible selections.
That bit -- two selections from a list of possible selection -- is muddy. But if we ignore that and only consider this bit:
In other words, comparing every selection in the list against every other selection in the list.
What you are describing is a full cross-comparison using a variation on the 'edit distance'.
The bottom line for which I think is that because you are seeking a relative measure, there is no way to avoid a full cross comparison.
This is because the difference between the first letters is equally significant to those between the last; thus any attempt to sort the strings; or to reduce the strings to numerical values so that they may be ordered with a view to reducing the number of comparisons; fails.
Maybe if you encode each category as a set of bits, and combine those bits into bit strings, the individual comparisons reduces to a matter of xoring the pairs of strings and counting the number of bits in the result.
That should prove to be substantially faster than 5 to 10 individual subtractions and summing of the results.
But I think the best you can hope for is to improve the per-pair comparison cost; rather than there being any way to reduce the number of per-pair comparisons that must be done. (I think :)
In reply to Re: Help thinking about an alternate algorithm
by BrowserUk
in thread Help thinking about an alternate algorithm
by Limbic~Region
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |