The items I'm sorting on are strings. It seems to me that merging the lists, sorting the combined result, and then scanning it to remove dupes would be slower than the hash approach, especially since I may have to go to disk if the combined size is a couple of million entries or so. I guess I should really benchmark it.
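A rough starting point for that benchmark might look like the Python sketch below. It is only an illustration under assumptions: everything fits in memory (so it says nothing about the go-to-disk case), the inputs are random 12-character strings, and the names `dedupe_hash`, `dedupe_sort`, `random_strings`, and `bench` are made up for this example.

```python
import random
import string
import timeit


def dedupe_hash(a, b):
    """Merge two lists of strings and drop duplicates via a hash set.

    The result is unordered.
    """
    return set(a) | set(b)


def dedupe_sort(a, b):
    """Merge two lists, sort the result, then scan neighbours to drop duplicates."""
    merged = sorted(a + b)
    out = []
    for s in merged:
        # After sorting, duplicates are adjacent, so compare with the last kept item.
        if not out or out[-1] != s:
            out.append(s)
    return out


def random_strings(n, length=12, seed=0):
    """Generate n pseudo-random lowercase strings (hypothetical test data)."""
    rng = random.Random(seed)
    return ["".join(rng.choices(string.ascii_lowercase, k=length)) for _ in range(n)]


def bench(n=100_000, repeats=3):
    """Time both approaches on two lists of about n strings with some overlap."""
    a = random_strings(n, seed=1)
    b = random_strings(n, seed=2) + a[: n // 4]  # force some duplicates
    for fn in (dedupe_hash, dedupe_sort):
        total = timeit.timeit(lambda: fn(a, b), number=repeats)
        print(f"{fn.__name__}: {total / repeats:.3f}s per run")


if __name__ == "__main__":
    bench()
```

Raising `n` toward the real data size (a couple of million entries) and swapping in real strings would give more meaningful numbers. Note that the two functions return different shapes: the hash version gives an unordered set, the sort version a sorted list, so if sorted output is needed anyway the hash result still has to be sorted afterwards.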