Thanks for the explanation, I get it now. (Then consider my node an explanation of ikegami's node.)
It's funny how many different ways people interpret similarity/resemblance. That was exactly why I chose to keep all the optional characters when the id already fits within the character limit: that way my code always leaves under-limit ids identical to the originals (== more similar). Of course, in other cases that's not the optimal choice.
I also thought of (but did not implement) a more generic way to decide which characters to drop from the original id: give the user a filter callback in which they can rate the characters (or substrings) under consideration, then drop the ones with the lowest rating (still from right to left). For example: [_ ] => 3, [A-Z] => 2, [a-z] => 1, anything else => 0.
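To make the rating idea concrete, here is a minimal sketch of how such a callback could drive the dropping. It is written in Python purely for illustration (not the Perl used elsewhere in the thread), and the names `default_rate` and `shorten` are hypothetical. It rates single characters only, not substrings, and it does nothing to keep the shortened ids unique; that part would still have to come from the tree.

```python
def default_rate(ch):
    """Example rating from the post: a higher value means 'more worth keeping'."""
    if ch in "_ ":
        return 3
    if "A" <= ch <= "Z":
        return 2
    if "a" <= ch <= "z":
        return 1
    return 0


def shorten(ident, limit, rate=default_rate):
    """Drop the lowest-rated characters of ident until it fits within limit.

    Among characters with equal rating, the rightmost one is dropped first.
    An id that already fits is returned unchanged.
    """
    excess = len(ident) - limit
    if excess <= 0:
        return ident
    # Order positions by rating (ascending), then by position (rightmost first).
    order = sorted(range(len(ident)), key=lambda i: (rate(ident[i]), -i))
    drop = set(order[:excess])
    # Rebuild the id from the surviving characters, preserving their order.
    return "".join(ch for i, ch in enumerate(ident) if i not in drop)
```

With the example ratings, `shorten("Foo_Bar9", 6)` drops the digit first (rating 0) and then the rightmost lowercase letter, giving `Foo_Ba`, while `shorten("abc", 5)` returns `abc` untouched.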
And this is why I collapsed the char-level suffix tree to the substring level: to ease access to substrings for the purpose of rating. Also, in its substring-level form the structure of the tree cannot interfere with the selection of (non-)ambiguous characters (as in choroba's remark above, if I understand it correctly).
Cheers
Re^4: Generate unique ids of maximum length
by ikegami (Patriarch) on Apr 13, 2010 at 22:28 UTC