And if there's dodgy input, as someone pointed out, e.g. Group-One being equivalent to Group One and GroupOne, then just run it through "sed" with a regex to make them all conform to one desired form before the de-dupe.
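Something like the rough sketch below would probably do. It assumes one name per line and that stripping hyphens and spaces is the normalization you want. The function name normalize_groups and the sample names are made up for illustration, and "sort -u" stands in for whatever de-dupe step is already in use.

```shell
# normalize_groups: hypothetical helper, reads names on stdin, one per line.
normalize_groups() {
    # Strip hyphens and spaces so Group-One, Group One and GroupOne
    # all collapse to GroupOne, then let sort -u drop the duplicates.
    sed 's/[- ]//g' | sort -u
}

# Sample input (made up); prints GroupOne and GroupTwo, one per line.
printf '%s\n' 'Group-One' 'Group One' 'GroupOne' 'Group-Two' | normalize_groups
```

If the variants also differ in case, the regex would need to handle that too; this sketch leaves case alone.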
The hardest line to type correctly is: stty erase ^H