> I believe that O(0.5 * (N + U) * log N) is a good approximation of the complexity of a combined mergesort-unique algorithm.

While O(0.5 * (N + U) * log N) isn't incorrect, given that 0 <= U <= N, O(0.5 * (N + U) * log N) and O(N log N) are the same. (That is, any function that's in O(0.5 * (N + U) * log N) is also in O(N log N), and vice versa.)
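To spell out why the two classes coincide, here is a short derivation that uses nothing beyond the stated bound 0 <= U <= N (and N >= 1, so that log N is non-negative):

```latex
\[
0 \le U \le N
\;\Longrightarrow\;
\tfrac{1}{2}\, N \log N \;\le\; \tfrac{1}{2}\,(N + U) \log N \;\le\; N \log N ,
\]
\[
\text{so}\quad \tfrac{1}{2}\,(N + U) \log N = \Theta(N \log N)
\quad\text{regardless of how } U \text{ grows with } N .
\]
```

The expression is squeezed between two constant multiples of N log N, so the U term can change the constant factor by at most 2, which big-O notation ignores.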
> The logic behind is that there are log N merge steps to perform. In the lower steps the probability of duplicates is very low, so the number of comparisons will be proportional to N.

Actually, for O(N log U) to be different from O(N log N), U must be o(N)†. That is, even if only 1 in 1000 elements is unique, O(N log U) is equivalent to O(N log N) (after all, log U == log(N/1000) == log(N) - log 1000). So, for a set where O(N log U) is different from O(N log N), the chance that two random elements are the same is actually pretty high.
†I think U should even be o(N^ε) for all ε > 0.
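A rough numerical check of the argument above and of the footnote, as a sketch only: it compares N log U against N log N for three growth rates of U. The function names `ratio` and `ratios_for` are made up for this illustration, and U is treated as a real number rather than an integer count.

```python
import math


def ratio(n, u):
    """Return (n * log u) / (n * log n), which simplifies to log u / log n."""
    return math.log(u) / math.log(n)


def ratios_for(n):
    """Ratios for three hypothetical growth rates of U at a given N."""
    return {
        "U = N/1000": ratio(n, n / 1000),        # constant fraction of N
        "U = sqrt(N)": ratio(n, math.sqrt(n)),   # N^epsilon with epsilon = 0.5
        "U = log N": ratio(n, math.log(n)),      # grows slower than any N^epsilon
    }


if __name__ == "__main__":
    for exponent in (6, 12, 24, 48):
        n = 10.0 ** exponent
        row = ", ".join(f"{name}: {value:.3f}" for name, value in ratios_for(n).items())
        print(f"N = 1e{exponent}: {row}")
```

With U = N/1000 the ratio creeps towards 1 as N grows, and with U = sqrt(N) it stays fixed at 0.5, so in both cases N log U is within a constant factor of N log N. Only when U grows more slowly than every power of N, as with U = log N, does the ratio shrink towards 0, which is the case where O(N log U) is a genuinely smaller class.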
In reply to Re^14: In-place sort with order assignment (runs)
by JavaFan
in thread In-place sort with order assignment
by BrowserUk