Is "/usr/bin/sort -u" really so dumb as to sort all of the duplicates and then only eliminate them at the end?

Yes.
A little history. Not so long ago, memory was expensive. Computers didn't have multiple gigabytes of RAM available. At best, they had a few megabytes, shared among dozens, if not hundreds, of users and their programs. sort was written with that mindset: sorting huge amounts of data without consuming heaps of memory. To eliminate duplicates without sorting first, you need something like a hash, which means either keeping all the data in memory or using really complex structured files.
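To make the trade-off concrete, here is a small illustration (not from the original post): a hash-based one-pass dedup in awk next to `sort -u`. The awk idiom `!seen[$0]++` is exactly the "hash" approach described above — it avoids sorting but must keep every unique line in memory, while sort(1) can spill to temporary files on disk.

```shell
# Sample input with duplicates (illustrative data).
printf 'b\na\nb\nc\na\n' > input.txt

# sort -u: sorts everything first, then drops adjacent duplicates.
# Memory-friendly: sort(1) can merge via temporary files on disk.
sort -u input.txt

# Hash-based dedup: one pass, preserves input order, but the 'seen'
# hash must hold every unique line in memory.
awk '!seen[$0]++' input.txt
```

Note the different output orders: `sort -u` emits sorted lines, while the awk pass keeps first-seen order.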
"A lower bound would be O(D + U*log(U)) and I suspect that is a pretty close bound."

Can you eliminate duplicates in o(N * log N) worst case? (That's little-oh, not big-oh.) If not, then O(D * log D) is a tight lower bound.
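As a rough illustration of the O(D + U*log(U)) shape (my sketch, not from the thread): a linear hashing pass over all N input lines first reduces the data to its U unique lines, and only those U lines reach the comparison sort, so the sort contributes O(U*log(U)) rather than O(N*log(N)).

```shell
# Illustrative pipeline: hash pass (O(N)) shrinks the input to its
# unique lines, then sort only handles the U survivors (O(U log U)).
printf 'b\na\nb\nc\na\n' > input.txt
awk '!seen[$0]++' input.txt | sort
```

When N is dominated by duplicates (U << N), this does strictly less comparison work than sorting the full input, which is the intuition behind the bound quoted above.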
In reply to Re^6: In-place sort with order assignment (-u) by JavaFan
in thread In-place sort with order assignment by BrowserUk